Abstract — We extend stochastic network optimization theory 
to treat networks with arbitrary sample paths for arrivals, 
channels, and mobility. The network can experience unexpected 
link or node failures, traffic bursts, and topology changes, and 
there are no probabilistic assumptions describing these time 
varying events. Performance of our scheduling algorithm is 
compared against an ideal T-slot lookahead policy that can 
make optimal decisions based on knowledge up to T-slots into 
the future. We develop a simple non-anticipating algorithm that 
provides network throughput-utility that is arbitrarily close to (or 
better than) that of the T-slot lookahead policy, with a tradeoff 
in the worst case queue backlog kept at any queue. The same 
policy offers even stronger performance, closely matching that of 
an ideal infinite lookahead policy, when ergodic assumptions are 
imposed. Our analysis uses a sample path version of Lyapunov 
drift and provides a methodology for optimizing time averages 
in general time-varying optimization problems. 

Index Terms — Queueing analysis, opportunistic scheduling, 
internet, routing, flow control, wireless networks, optimization 



I. Introduction 

Networks experience unexpected events. Consider the network
of Fig. 1 and focus on the session that sends a stream of
packets from node A to node D. Suppose that several paths 
are used, but due to congestion on other links, the primary 
path that can deliver the most data is the path A, B, C, D. 
However, suppose that there is a failure at node B in the 
middle of the session. An algorithm with perfect knowledge of 
the future would take advantage of the path A,B,C,D while 
it is available, and would switch to alternate paths before the 
failure occurs. The algorithm would also be able to predict 
the traffic load on different links at different times, and would 
optimally route in anticipation of these events. 

The above example holds if the network of Fig. 1 is a
wireline network, a wireless network, or a mixture of wired 
and wireless connections. As another example, suppose the 
network contains an additional mobile wireless node E, and 
that the following unexpected event occurs: Node E moves 
into close proximity to node A, allowing a large number of 
packets to be sent to it. It then moves into close proximity to 
node D, providing an opportunity to transmit packets to this 
destination node. If this event could be anticipated, we could
take advantage of it and improve the short term throughput by
routing many packets over the relay E.

Fig. 1. A primary path from A to D, with alternative paths shown in the
event of a failure at node B.

These examples illustrate different types of unexpected 
events that can be exploited to improve performance. There are 
of course even more complex sequences of arrival, channel, 
and mobility events that, if known in advance, could be 
exploited to yield improved performance. However, because 
realistic networks do not have knowledge of the future, it is 
not clear if these events can be practically used. Surprisingly, 
this paper shows that it is possible to reap the benefits of these 
time varying events without any knowledge of the future. We 
show that a simple non-anticipating policy can closely track 
the performance of an ideal T-slot lookahead policy that has 
perfect knowledge of the future up to T slots. Proximity to 
the performance of the T-slot lookahead policy comes with a 
corresponding tradeoff in the worst case queue backlog stored 
in any queue of the network, which also affects a tradeoff in 
network delay. 

Specifically, we treat networks with slotted time with normalized
slots $t \in \{0, 1, 2, \ldots\}$. We measure network utility
over an interval of timeslots according to a concave function of 
the time average throughput vector achieved over that interval. 
We show that for any positive integer frame size T, and 
any interval that consists of R frames of T slots, the utility 
achieved over the interval is greater than or equal to the utility 
achieved by using the T-slot lookahead policy over each of 
the R frames, minus a "fudge factor" that has the form:

$$ \text{fudge factor} = \frac{B_1 T}{V} + \frac{B_2 V}{RT} $$

where $B_1$ and $B_2$ are constants, and $V$ is a positive parameter
that can be chosen as desired to make the term $B_1 T / V$
arbitrarily small, with a tradeoff in the worst case queue
backlog that is $O(V)$. This shows that we reap almost the
same benefits of knowing the future up to T slots if we
choose V suitably large and if we wait for the completion
of R frames of size T, where R is sufficiently large to make
$B_2 V / (RT)$ small. Remarkably, the constants $B_1$ and $B_2$ can
be explicitly computed in advance, without any assumptions 
on the underlying stochastic processes that describe the time 
varying events. 
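
To make the tradeoff in the fudge factor concrete, the following minimal sketch evaluates the two terms for illustrative constants. The numerical values of `B1`, `B2`, and `T` are assumptions chosen only for illustration, and the helper `balanced_V` is a hypothetical name: it picks the value of V that equalizes the two terms, which minimizes their sum for fixed R and T.

```python
import math

def fudge_factor(B1: float, B2: float, T: int, V: float, R: int) -> float:
    """Gap term B1*T/V + B2*V/(R*T) described in the introduction."""
    return B1 * T / V + B2 * V / (R * T)

def balanced_V(B1: float, B2: float, T: int, R: int) -> float:
    """Value of V that makes both terms equal (hypothetical helper)."""
    return T * math.sqrt(B1 * R / B2)

# Illustrative constants only; the real B1 and B2 depend on the network.
B1, B2, T = 2.0, 0.5, 10
gaps = {}
for R in (1, 100, 10_000):
    V = balanced_V(B1, B2, T, R)
    gaps[R] = fudge_factor(B1, B2, T, V, R)
    print(f"R={R:6d}  V={V:10.2f}  fudge factor={gaps[R]:.4f}")
```

With this choice of V the gap equals $2\sqrt{B_1 B_2 / R}$, so it shrinks like $1/\sqrt{R}$ while V, and hence the $O(V)$ backlog bound, grows like $\sqrt{R}$.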

This establishes a universal scheduling paradigm that shows 
a single network algorithm can provide strong mathemati- 
cal guarantees for any network and for any time varying 
sample paths. The algorithm that we use is not new: It is 
a modified version of the backpressure based "drift-minus- 
reward" algorithms that we previously developed and used 
in different contexts in our prior work [1][2][3][4]. These 
algorithms were originally developed for the case when new 
arrivals and new channel states are independent and identically 
distributed (i.i.d.) over slots, and were analyzed using a 
Lyapunov drift defined as an expectation over the underlying 
probability distribution. However, it is known that Lyapunov 
based algorithms that are designed under i.i.d. assumptions 
yield similar performance under more general ergodic (but 
non-i.i.d.) probability models. This is shown for network 
stability using a T-slot Lyapunov drift in [5] [6], and using 
a related delayed-queue analysis (that often provides tighter 
delay bounds) in [7]. Further, such algorithms are known 
to be robust to non-ergodic situations such as when traffic 
yields "instantaneous rates" that can vary arbitrarily inside the 
capacity region [8][4][1], and when "instantaneous capacity 
regions" can vary arbitrarily but are assumed to always contain 
the traffic rate vector [9]. However, the prior non-ergodic 
analysis [8][4][1][9] still assumes an underlying probability 
model, and makes assumptions about traffic rates and network 
capacity with respect to this model. 

The analysis in this paper is new and uses a sample path 
version of Lyapunov drift, without any probabilistic assump- 
tions. This framework allows treatment of realistic channels 
and traffic traces. Because arbitrary sample paths may not 
have well defined time averages, typical equilibrium notions of 
network capacity and optimal time average utility cannot be 
used. We thus use a new metric that measures performance 
with respect to ideal T-slot lookahead policies. This is a 
possible framework for treating the important open questions 
identified in [10] concerning non-equilibrium network theory. 
Further, our results provide universal techniques for optimizing 
time averages that are useful for other types of time-varying 
systems. 

A. Comparison to Related Work 

We note that universal algorithms are important in other 
fields. For example, the universal Lempel-Ziv data compres- 
sion algorithm operates on arbitrary files [11], and univer- 
sal stock portfolio allocation algorithms hold for arbitrary 
price sample paths [12][13][14][15]. Our work provides a 
universal approach to network scheduling. It is important to 
note that prior work in the area of competitive ratio analysis 
[16] [17] [18] [19] and adversarial queueing theory [20] also 
considers network scheduling problems with arbitrary sample 
paths, albeit in a different context. Work in [17] considers a
large class of admission control problems for networks with
random arrivals that earn revenue if accepted. An algorithm 
is developed that yields revenue that differs by a factor 
of $O(\log(N))$ from that of an ideal algorithm with perfect
knowledge of the future, where N is the number of network 
nodes. Further, this asymptotic ratio is shown to be optimal, 
in the sense that there is always a worst case sequence of 
packet arrivals that can reduce revenue by this amount. Related 
$O(\log(N))$ competitive ratio results are developed for energy
optimization in [18] and for wireless admission control in [19]. 
The works [16] [17] [18] [19] do not consider networks with 
time varying channels or mobility, and do not treat (or exploit) 
network queueing. An adversarial queueing theory example in 
[20] shows that, if channels are time varying, the competitive 
ratio can be much worse than logarithmic, even for a simple 
packet-based network with a single link. 

Our work treats the difficult case of multi-hop networks 
with arbitrary time varying channels, mobility, and penalty 
constraints. However, rather than pursuing a competitive ratio 
analysis, we measure performance against a T-slot lookahead 
metric. We show that we can closely track the performance of 
an ideal T-slot lookahead policy, for any arbitrary (but finite) 
T. This does not imply that the algorithm has an optimal 
competitive ratio, because the utility of a T-slot lookahead 
policy for finite T may not be as good as the performance 
of an infinite lookahead policy. However, it turns out that 
our policy indeed approaches an optimal competitive ratio 
(measured with respect to an infinite lookahead policy) under 
the special case when the time varying events are ergodic. 

Finally, we note that a frame-based metric, similar to our 
T-slot lookahead metric, was used in [20] to treat static 
wireline networks with arbitrary arrivals but fixed topology and 
channel states. There, an algorithm that queues all packets that 
arrive in a frame, solves a network-wide utility maximization 
problem for these packets based on knowledge of the static 
link capacities, and implements the solution in the next frame, 
is shown to achieve revenue that is close to that of a policy 
that a-priori knows the packet arrival times over one frame. In 
our context, we do not have the luxury of solving a network- 
wide utility maximization problem based on known static link 
capacities, because the network itself is changing with time. 
Our solution strategy is thus completely different from that 
of [20]. Rather than a frame-based approach, our algorithm 
makes simple "max-weight" decisions every slot based on a 
quadratic Lyapunov function. It is interesting to note that this 
imposes a "cost" associated with each decision that depends 
on the current network queue state, which is similar in spirit to 
the cost functions used in the algorithms of [16] [17] [18] [19] 
for competitive ratio analysis. 

B. Outline of Paper 

The next section describes the general problem of constrained
optimization of time varying systems. Section III provides
a universal solution technique that measures performance
against a T-slot lookahead policy. Section IV applies the
framework to a simple internet model, and Section V applies
the framework to a more extensive class of time varying
networks.

II. General Time Varying Optimization Problems 

Here we provide a framework for universal constrained 
optimization for a general class of time varying systems. 
The framework is applied in Sections IV and V to solve the
network problems of interest. 

Consider a slotted system with normalized timeslots $t \in \{0, 1, 2, \ldots\}$.
The system contains $K$ queues with current backlog given by the vector
$Q(t) = (Q_1(t), \ldots, Q_K(t))$. Let $\omega(t)$ denote a random event
that occurs on slot $t$. The random event $\omega(t)$ represents a
collection of current system parameters and takes values in some abstract
event space $\Omega$. We treat $\omega(t)$ as a pre-determined function of
$t \in \{0, 1, 2, \ldots\}$, although each value $\omega(t)$ is not revealed
until the beginning of slot $t$. Every slot, a system controller observes
the current value of $\omega(t)$ and chooses a control action $\alpha(t)$,
constrained to some action space $\mathcal{A}_{\omega(t)}$ that can depend
on $\omega(t)$. The random event $\omega(t)$ and the corresponding control
action $\alpha(t) \in \mathcal{A}_{\omega(t)}$ produce a service vector
$b(t) = (b_1(t), \ldots, b_K(t))$, an arrival vector
$a(t) = (a_1(t), \ldots, a_K(t))$, and attribute vectors
$x(t) = (x_1(t), \ldots, x_M(t))$, $y(t) = (y_0(t), y_1(t), \ldots, y_L(t))$
(for some non-negative integers $M$ and $L$). These vectors are
general functions of $\omega(t)$ and $\alpha(t)$:

$$
\begin{aligned}
a_k(t) &= a_k(\alpha(t), \omega(t)) \quad \forall k \in \{1, \ldots, K\} \\
b_k(t) &= b_k(\alpha(t), \omega(t)) \quad \forall k \in \{1, \ldots, K\} \\
x_m(t) &= x_m(\alpha(t), \omega(t)) \quad \forall m \in \{1, \ldots, M\} \\
y_l(t) &= y_l(\alpha(t), \omega(t)) \quad \forall l \in \{0, 1, \ldots, L\}
\end{aligned}
$$

The queue dynamics are determined by the arrival and
service variables by:

$$ Q_k(t+1) = \max[Q_k(t) - b_k(t), 0] + a_k(t) \tag{1} $$
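
As a small illustration of the dynamics (1), the sketch below applies the update to a two-queue system over three slots. The function name `queue_step` and the arrival and service values are assumptions made up for this example.

```python
def queue_step(Q, a, b):
    """One slot of update (1): serve up to b_k, floor at zero, then add a_k."""
    return [max(q - bk, 0.0) + ak for q, ak, bk in zip(Q, a, b)]

# Illustrative sample path for K = 2 queues over three slots.
Q = [0.0, 3.0]
arrivals = [[2.0, 0.0], [1.0, 0.0], [0.0, 4.0]]
services = [[1.0, 5.0], [1.0, 1.0], [3.0, 1.0]]
for a, b in zip(arrivals, services):
    Q = queue_step(Q, a, b)
    print(Q)
```

Note that service offered to an empty queue is wasted: in the first slot the second queue is offered 5 units of service but holds only 3.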



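Let $\bar{x}_m$ be the time average of $x_m(t)$ under a particular
control policy implemented over a finite number of slots $t_{end}$:

$$ \bar{x}_m \triangleq \frac{1}{t_{end}} \sum_{\tau=0}^{t_{end}-1} x_m(\tau) \tag{2} $$

Define $\bar{a}_k$, $\bar{b}_k$, $\bar{y}_l$ similarly. Define
$\bar{x} \triangleq (\bar{x}_1, \ldots, \bar{x}_M)$. The goal
is to design a policy that solves the following time average
optimization problem:

$$
\begin{aligned}
\text{Minimize:} \quad & \bar{y}_0 + f(\bar{x}) && (3) \\
\text{Subject to:} \quad & \bar{y}_l + g_l(\bar{x}) \leq 0 \quad \forall l \in \{1, \ldots, L\} && (4) \\
& \bar{a}_k \leq \bar{b}_k \quad \forall k \in \{1, \ldots, K\} && (5) \\
& \bar{x} \in \mathcal{X} && (6) \\
& \alpha(t) \in \mathcal{A}_{\omega(t)} \quad \forall t \in \{0, \ldots, t_{end} - 1\} && (7)
\end{aligned}
$$

where $f(x)$, $g_l(x)$ are convex cost functions of the vector
$x = (x_1, \ldots, x_M)$, and $\mathcal{X}$ is a general convex subset of
$\mathbb{R}^M$. The above problem is of interest even if there are no
underlying queues $Q_k(t)$ (so that the constraints (5) are removed), and/or
if $\mathcal{X} = \mathbb{R}^M$ (so that the constraint (6) is removed),
and/or if $f(\cdot) = g_l(\cdot) = 0$.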

The above problem is stated in terms of a finite horizon
of size $t_{end}$. The minimum cost in (3) is defined for a given
$\omega(t)$ function over $t \in \{0, \ldots, t_{end} - 1\}$, and considers
all possible control actions that can be implemented over the
time horizon, including actions that have full knowledge of the
future values of $\omega(t)$. However, we desire a non-anticipating
control policy that only knows the current $\omega(t)$ value on each
slot $t$, and has no knowledge of the future. We note that a
theory for solving stochastic network optimization problems
similar to (3)-(7) (without the set constraint (6)) is developed
in [1] for an infinite horizon context under the assumption
that $\omega(t)$ is i.i.d. over slots with some (possibly unknown)
probability distribution, and related problems are treated in a
fluid limit sense in [21] [22]. Here, we do not consider any
probability model for $\omega(t)$, so that it is not possible to use
law of large number averaging principles, or to achieve the
optimum performance in an "expected" sense. Rather, we shall
deterministically achieve an "approximate" optimum for large
$t_{end}$ values, to be made precise in future sections.

A. Boundedness and Feasibility Assumptions 

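Here we present assumptions concerning the functions
$a_k(\cdot)$, $b_k(\cdot)$, $x_m(\cdot)$, $y_l(\cdot)$ that ensure the
problem (3)-(7) is feasible with a bounded infimum cost.

Assumption A1: The functions $a_k(\cdot)$, $b_k(\cdot)$, $x_m(\cdot)$,
$y_l(\cdot)$ are bounded, so that for all
$\omega \in \{\omega(0), \ldots, \omega(t_{end} - 1)\}$,
all $\alpha \in \mathcal{A}_{\omega}$, and all $k \in \{1, \ldots, K\}$,
$m \in \{1, \ldots, M\}$, $l \in \{0, 1, \ldots, L\}$ we have:

$$
\begin{aligned}
0 &\leq a_k(\alpha, \omega) \leq a_k^{max} \\
0 &\leq b_k(\alpha, \omega) \leq b_k^{max} \\
x_m^{min} &\leq x_m(\alpha, \omega) \leq x_m^{max} \\
y_l^{min} &\leq y_l(\alpha, \omega) \leq y_l^{max}
\end{aligned}
$$

for some finite constants $a_k^{max}$, $b_k^{max}$, $x_m^{min}$,
$x_m^{max}$, $y_l^{min}$, $y_l^{max}$. Further, the cost functions $f(x)$
and $g_l(x)$ are defined over all vectors $(x_1, \ldots, x_M)$ that satisfy
$x_m^{min} \leq x_m \leq x_m^{max}$ for all $m \in \{1, \ldots, M\}$, and
have finite upper and lower bounds $f^{min}$, $f^{max}$, $g_l^{min}$,
$g_l^{max}$ over this region.

Assumption A2: For all $\omega \in \{\omega(0), \ldots, \omega(t_{end} - 1)\}$,
there is at least one control action $\alpha'_{\omega} \in \mathcal{A}_{\omega}$
that satisfies:

$$
\begin{aligned}
y_l(\alpha'_{\omega}, \omega) + g_l(x(\alpha'_{\omega}, \omega)) &\leq 0 \quad \forall l \in \{1, \ldots, L\} && (8) \\
a_k(\alpha'_{\omega}, \omega) &\leq b_k(\alpha'_{\omega}, \omega) \quad \forall k \in \{1, \ldots, K\} && (9) \\
x(\alpha'_{\omega}, \omega) &\in \mathcal{X} && (10)
\end{aligned}
$$

where $x(\alpha, \omega)$ is defined:

$$ x(\alpha, \omega) \triangleq (x_1(\alpha, \omega), \ldots, x_M(\alpha, \omega)) $$

For a given $\{\omega(0), \ldots, \omega(t_{end} - 1)\}$, we say that the
problem (3)-(7) is feasible if there exist control actions
$\{\alpha(0), \ldots, \alpha(t_{end} - 1)\}$ that satisfy the constraints (4)-(7).
Assumption A2 ensures that the problem is feasible (just
consider the control actions $\alpha(t) = \alpha'_{\omega(t)}$ for all $t$,
and use Jensen's inequality to note that $g_l(\bar{x}) \leq \overline{g_l(x)}$).
Define $F^*$ as the infimum value of the cost metric (3) over all feasible
policies. The value $F^*$ is finite by Assumption A1. Then for
any $\epsilon > 0$, there are control actions
$\{\alpha^*(0), \ldots, \alpha^*(t_{end} - 1)\}$
that satisfy the constraints (4)-(7) with a total cost that satisfies:

$$ F^* \leq \bar{y}_0^* + f(\bar{x}^*) \leq F^* + \epsilon $$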



Appendix B provides conditions that ensure the infimum $F^*$
is achievable by a particular policy (i.e., with $\epsilon = 0$). We note
that Assumption A2 can be relaxed to only require feasibility
over frames of T slots, although the resulting performance
bounds in Theorems 1 and 2 are slightly altered in this case.

B. Cost Function Assumptions 

The cost functions $f(x)$, $g_l(x)$ are assumed to be convex
and continuous over the region of all $(x_1, \ldots, x_M)$ vectors
that satisfy:

$$ x_m^{min} \leq x_m \leq x_m^{max} \quad \text{for all } m \in \{1, \ldots, M\} \tag{11} $$

In addition, we assume that the magnitude of the mth left and
right partial derivatives of $f(x)$ with respect to $x_m$ are upper
bounded by finite constants $\nu_m \geq 0$ for all $x$ that satisfy (11)
and such that $x_m^{min} < x_m < x_m^{max}$.^1 Similarly, the magnitude
of the right and left partial derivatives of $g_l(x)$ are upper
bounded by finite constants $\beta_{l,m} \geq 0$. This implies that:

$$ f(x + y) \leq f(x) + \sum_{m=1}^{M} \nu_m |y_m| \tag{12} $$

$$ g_l(x + y) \leq g_l(x) + \sum_{m=1}^{M} \beta_{l,m} |y_m| \tag{13} $$

for all $x$, $y$ such that $x$ and $x + y$ are in the region specified
by (11).

An example of a non-differentiable cost function that satisfies
all of the above assumptions is:

$$ f(x_1, \ldots, x_M) = \max[x_1, \ldots, x_M] $$

In this case, we have $\nu_m = 1$ for all $m$. Another example is
a separable cost function:

$$ f(x_1, \ldots, x_M) = \sum_{m=1}^{M} f_m(x_m) \tag{14} $$

where functions $f_m(x)$ are continuous and convex with derivatives
bounded in magnitude by $\nu_m$ over $x_m^{min} < x < x_m^{max}$.
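
The bound (12) can be spot-checked numerically for the max example with $\nu_m = 1$. The sketch below is only a sanity check on random points, not a proof; the function names and the sampling ranges are assumptions made for illustration (the points are taken to lie inside a sufficiently large box of the form (11)).

```python
import random

def f_max(x):
    """The non-differentiable example f(x) = max[x_1, ..., x_M]."""
    return max(x)

def bound_12_holds(f, nu, x, y, tol=1e-12):
    """Check f(x + y) <= f(x) + sum_m nu_m * |y_m| for one pair (x, y)."""
    shifted = [xm + ym for xm, ym in zip(x, y)]
    return f(shifted) <= f(x) + sum(n * abs(ym) for n, ym in zip(nu, y)) + tol

random.seed(0)
M = 4
nu = [1.0] * M  # derivative bounds for the max function
ok = all(
    bound_12_holds(
        f_max,
        nu,
        [random.uniform(-5, 5) for _ in range(M)],
        [random.uniform(-2, 2) for _ in range(M)],
    )
    for _ in range(1000)
)
print("bound (12) held on all sampled points:", ok)
```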

C. Applications of This Optimization Framework 

We show in Section V that this problem applies to general
dynamic networks. There, the $\omega(t)$ value represents a
collection of channel conditions for all network links on slot
t. This includes the simple model where link conditions are
either in the ON or OFF state, representing connections or
disconnections that can vary from slot to slot due to fading
channels and/or user mobility. Each node can discover the state
of its links by probing to find existing neighbors on the current
slot. The $\alpha(t)$ value represents a collection of routing, resource
allocation, and/or flow control decisions that are taken by the
network in reaction to the current $\omega(t)$ value.

In addition to dynamic networks and queueing systems,
the problem (3)-(7) has applications in many other areas that
involve optimization over time varying systems. For example,
our recent work in [15] presents applications to stock market
trading problems. There, $\omega(t)$ represents a vector of current
stock prices, and the constraints $\bar{a}_k \leq \bar{b}_k$ ensure that the
average amount of stock k sales cannot exceed the average
amount of stock k purchases.

^1 All convex functions have well defined right and left partial derivatives.

We note that Assumption A2, which assumes that for any $\omega$
there exists an action $\alpha'_{\omega}$ that satisfies
$a_k(\alpha'_{\omega}, \omega) \leq b_k(\alpha'_{\omega}, \omega)$,
often holds for systems that have a physical "idle" control
action that reduces the inequality to $0 \leq 0$. For example, in
network problems, the values of $a_k(\cdot)$ and $b_k(\cdot)$ often represent
transmission rates, power expenditures, or newly accepted
jobs, and the idle action is the one that accepts no new arrivals
into the network and transmits no data over any link of the
network. For stock market problems, the idle action is often
the action that neither buys nor sells any shares of stock on
the current slot.

D. T-Slot Lookahead Policies 

Rather than consider the optimum of the problem (3)-(7)
over the full time interval $t \in \{0, \ldots, t_{end} - 1\}$, we consider
the minimum cost that can be incurred over successive frames
of size T, assuming that the time average constraints (4)-(6)
must be achieved over each frame. Specifically, let T be a
positive integer, representing a frame size. For a non-negative
integer r, define $F_r^*$ as the infimum value associated with the
following problem (where $\gamma \triangleq (\gamma_1, \ldots, \gamma_M)$):

$$
\begin{aligned}
\text{Minimize:} \quad & h_0 + f(\gamma) && (15) \\
\text{Subject to:} \quad & h_l + g_l(\gamma) \leq 0 \quad \forall l \in \{1, \ldots, L\} \\
& \gamma \in \mathcal{X} \\
& \gamma_m = \frac{1}{T} \sum_{\tau=rT}^{(r+1)T-1} x_m(\alpha(\tau), \omega(\tau)) \quad \forall m \in \{1, \ldots, M\} \\
& h_l = \frac{1}{T} \sum_{\tau=rT}^{(r+1)T-1} y_l(\alpha(\tau), \omega(\tau)) \quad \forall l \in \{0, 1, \ldots, L\} \\
& \frac{1}{T} \sum_{\tau=rT}^{(r+1)T-1} [a_k(\alpha(\tau), \omega(\tau)) - b_k(\alpha(\tau), \omega(\tau))] \leq 0 \quad \forall k \in \{1, \ldots, K\} \\
& \alpha(\tau) \in \mathcal{A}_{\omega(\tau)} \quad \forall \tau \in \{rT, \ldots, (r+1)T - 1\}
\end{aligned}
$$

The value of $F_r^*$ represents the infimum of the cost metric that
can be achieved over the frame, considering all policies that
satisfy the constraints and that have perfect knowledge of the
future $\omega(t)$ values over the frame. Our new goal is to design a
non-anticipating control policy that is implemented over time
$t_{end} = RT$ (for some positive integer R), and that satisfies
all constraints of the original problem while achieving a total
cost that is close to (or smaller than) the value of:

$$ \frac{1}{R} \sum_{r=0}^{R-1} F_r^* \tag{16} $$

For $R = 1$, it is clear that the value in (16) is the same as the
optimal cost associated with the problem (3)-(7) with $t_{end} = T$.
For $R > 1$, it can be shown that the value in (16) is greater
than or equal to the optimal cost associated with the problem
(3)-(7) for $t_{end} = RT$. The reason that the problem (3)-(7) might
have a strictly smaller cost is that it only requires the time
average constraints to be met over the full time interval, rather
than requiring them to be satisfied on each of the R frames.
Nevertheless, when T is large, it is not trivial to achieve the
cost value of (16), as this cost is defined over policies that
have T-slot lookahead, whereas an actual policy does not have
future lookahead capabilities.
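
For very small frames and a finite action set, the lookahead value $F_r^*$ of problem (15) can be computed by exhaustive search. The sketch below does this for a toy single-queue model that is purely an assumption for illustration: there are no attributes $x_m$ and no extra constraints ($M = 0$, $L = 0$, $f = 0$), each action is a hypothetical pair (admit, transmit), $\omega$ is the number of packets the channel can serve, and the cost $y_0$ rewards admissions and charges 0.3 per transmission.

```python
from itertools import product

ACTIONS = [(adm, tx) for adm in (0, 1) for tx in (0, 1)]

def outcome(action, omega):
    """Hypothetical slot model: returns (y0, a, b) for one action and event."""
    admit, transmit = action
    y0 = -1.0 * admit + 0.3 * transmit  # negative cost = reward for admitting
    return y0, float(admit), float(transmit * omega)

def frame_lookahead_cost(omegas):
    """Brute-force F_r^* for one frame, given every omega value in the frame."""
    T = len(omegas)
    best = None
    for seq in product(ACTIONS, repeat=T):
        results = [outcome(act, w) for act, w in zip(seq, omegas)]
        # Frame constraint of (15): average arrivals <= average service.
        if sum(a - b for _, a, b in results) <= 1e-12:
            cost = sum(y0 for y0, _, _ in results) / T
            best = cost if best is None else min(best, cost)
    return best

F0 = frame_lookahead_cost([0, 2, 0, 1])
print(F0)
```

The search visits $4^T$ action sequences, so it is only usable for tiny T. It illustrates why the lookahead value is a benchmark rather than a practical policy: it needs the whole $\omega$ sequence of the frame in advance.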

III. Solution to the General Problem 

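First note that the problem (3)-(7) is equivalent to the following
problem, which introduces auxiliary variables
$\gamma(t) = (\gamma_1(t), \ldots, \gamma_M(t))$ for $t \in \{0, \ldots, t_{end} - 1\}$:

$$
\begin{aligned}
\text{Minimize:} \quad & \bar{y}_0 + \overline{f(\gamma)} && (17) \\
\text{Subject to:} \quad & \bar{y}_l + \overline{g_l(\gamma)} \leq 0 \quad \forall l \in \{1, \ldots, L\} && (18) \\
& \bar{a}_k \leq \bar{b}_k \quad \forall k \in \{1, \ldots, K\} && (19) \\
& \bar{\gamma}_m = \bar{x}_m \quad \forall m \in \{1, \ldots, M\} && (20) \\
& \alpha(t) \in \mathcal{A}_{\omega(t)} \quad \forall t \in \{0, \ldots, t_{end} - 1\} && (21) \\
& \gamma(t) \in \mathcal{X} \quad \forall t \in \{0, \ldots, t_{end} - 1\} && (22) \\
& x_m^{min} \leq \gamma_m(t) \leq x_m^{max} \quad \forall t \in \{0, \ldots, t_{end} - 1\}, \forall m && (23)
\end{aligned}
$$

where $\overline{f(\gamma)}$ is defined:

$$ \overline{f(\gamma)} \triangleq \frac{1}{t_{end}} \sum_{\tau=0}^{t_{end}-1} f(\gamma(\tau)) $$

and where $\overline{g_l(\gamma)}$ is defined similarly.

To see that the above problem (17)-(23) is equivalent to the
original problem (3)-(7), note that any optimal solution of (3)-(7)
also satisfies the constraints (18)-(23), with the same value
of the cost metric (17), provided that we define $\gamma_m(t) = \bar{x}_m$
for all $t$, with $\bar{x}_m$ being the time average of $x_m(t)$ under
the solution to the problem (3)-(7). Thus, the minimum cost
metric of the new problem (17)-(23) is less than or equal to
that of (3)-(7). On the other hand, by Jensen's inequality and
convexity of $f(\gamma)$, $g_l(\gamma)$, we have for any solution of the new
problem:

$$
\begin{aligned}
\overline{f(\gamma)} &\geq f(\bar{\gamma}) = f(\bar{x}) \\
\overline{g_l(\gamma)} &\geq g_l(\bar{\gamma}) = g_l(\bar{x})
\end{aligned}
$$

From this it easily follows that the minimum cost metric of
the new problem (17)-(23) is also greater than or equal to that
of the original problem (3)-(7).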

Such auxiliary variables are introduced in [2] and [1] to
optimize functions of time averages, which is very different
from optimizing time averages of functions. Note that if
$f(\cdot) = g_l(\cdot) = 0$ for all $l$, then $M = 0$ and we do not
need any auxiliary variables.^2 Following the framework of [2]
[1], which solves problems similar to the above under ergodic
assumptions on $\omega(t)$, in addition to the actual queues $Q_k(t)$ we
define virtual queues $Z_l(t)$ and $H_m(t)$ for each $l \in \{1, \ldots, L\}$
and $m \in \{1, \ldots, M\}$, with $Z_l(0) = H_m(0) = 0$ for all $l$ and
$m$, and with update equation:

$$ Z_l(t+1) = \max[Z_l(t) + y_l(t) + g_l(\gamma(t)), 0] \tag{24} $$

$$ H_m(t+1) = H_m(t) + \gamma_m(t) - x_m(t) \tag{25} $$
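
A minimal sketch of the virtual queue updates (24) and (25) follows. The function name `virtual_queue_step` and the toy values are assumptions for illustration, and the toy run takes $g_1(\cdot) = 0$ so that only $y_1(t)$ drives $Z_1(t)$.

```python
def virtual_queue_step(Z, H, y, g_of_gamma, gamma, x):
    """One slot of updates (24) and (25).

    Z, y, g_of_gamma: length-L lists (y holds y_1..y_L, g_of_gamma holds
    the values g_l(gamma(t))). H, gamma, x: length-M lists.
    """
    Z_next = [max(z + yl + gl, 0.0) for z, yl, gl in zip(Z, y, g_of_gamma)]
    H_next = [h + gm - xm for h, gm, xm in zip(H, gamma, x)]
    return Z_next, H_next

# Toy run with L = 1 and M = 1, starting from empty virtual queues.
Z, H = [0.0], [0.0]
gammas = [1.0, 0.5, 2.0]
xs = [0.5, 1.5, 1.0]
ys = [-1.0, 2.0, -0.5]
for gm, xm, yl in zip(gammas, xs, ys):
    Z, H = virtual_queue_step(Z, H, [yl], [0.0], [gm], [xm])
print(Z, H)
```

Because (25) has no max operation, $H_1(t)$ is exactly the running sum of $\gamma_1(\tau) - x_1(\tau)$ and may be negative along the way, whereas $Z_1(t)$ is floored at zero.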

Note that the queues $Q_k(t)$ and $Z_l(t)$ are non-negative for
all $t$, while the queues $H_m(t)$ can be possibly negative. If
the queues $Z_l(t)$, $Q_k(t)$, $H_m(t)$ are close to zero at time $t_{end}$,
then the inequality constraints (18), (19), (20), respectively, are
close to being satisfied, as specified by the following lemma.

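Lemma 1: (Approximate Constraint Satisfaction) For any
sequence $\{\omega(0), \omega(1), \omega(2), \ldots\}$, any designated end time
$t_{end} > 0$, any non-negative initial queue states $Z_l(0)$, $Q_k(0)$,
any real valued initial queue states $H_m(0)$, and any sequence
of control decisions $\alpha(0) \in \mathcal{A}_{\omega(0)}$,
$\alpha(1) \in \mathcal{A}_{\omega(1)}$, $\alpha(2) \in \mathcal{A}_{\omega(2)}$,
etc., we have for all $l \in \{1, \ldots, L\}$, $k \in \{1, \ldots, K\}$:

$$ \bar{a}_k \leq \bar{b}_k + \frac{Q_k(t_{end}) - Q_k(0)}{t_{end}} \tag{26} $$

$$ \bar{y}_l + g_l(\bar{x}) \leq \frac{Z_l(t_{end}) - Z_l(0)}{t_{end}} + \sum_{m=1}^{M} \beta_{l,m} \frac{|H_m(t_{end}) - H_m(0)|}{t_{end}} \tag{27} $$

$$ \bar{x} + \epsilon \in \mathcal{X} \tag{28} $$

where $\epsilon = (\epsilon_1, \ldots, \epsilon_M)$ is a vector with components that
satisfy:

$$ |\epsilon_m| \leq \frac{|H_m(t_{end}) - H_m(0)|}{t_{end}} \quad \forall m \in \{1, \ldots, M\} $$

where $\bar{x} = (\bar{x}_1, \ldots, \bar{x}_M)$ represents a time average over the
first $t_{end}$ slots, as defined in (2), as do time averages $\bar{a}_k$, $\bar{b}_k$,
$\bar{y}_l$.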

Note that the inequalities (26)-(28) correspond to the desired
constraints (4)-(6), and show that these desired constraints
are approximately satisfied if the values of $Q_k(t_{end})/t_{end}$,
$Z_l(t_{end})/t_{end}$, $|H_m(t_{end})|/t_{end}$ are small. Recall that $\beta_{l,m}$
are bounds on the partial derivatives of $g_l(x)$, and hence if
for a particular $l$ we have $g_l(x) = 0$ for all $x$, then $\beta_{l,m} = 0$
for all $m$, which tightens the bound in (27). This is useful for
linear cost functions, as described in more detail in Section
III-D.

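Proof: (Lemma 1) From the queueing update equation (1)
we have for any $k \in \{1, \ldots, K\}$ and all $t \geq 0$:

$$ Q_k(t+1) \geq Q_k(t) - b_k(t) + a_k(t) $$

Summing over $\tau \in \{0, 1, \ldots, t - 1\}$ for a given $t > 0$, and
dividing by $t$ gives:

$$ \frac{1}{t} \sum_{\tau=0}^{t-1} a_k(\tau) \leq \frac{1}{t} \sum_{\tau=0}^{t-1} b_k(\tau) + \frac{Q_k(t) - Q_k(0)}{t} $$

Plugging $t = t_{end}$ into the above inequality proves (26).

Similarly, the update equation (25) easily leads to the
following for all $m \in \{1, \ldots, M\}$:

$$ \frac{1}{t} \sum_{\tau=0}^{t-1} \gamma_m(\tau) - \frac{1}{t} \sum_{\tau=0}^{t-1} x_m(\tau) = \frac{H_m(t) - H_m(0)}{t} $$

Using $t = t_{end}$ in the above yields:

$$ |\bar{\gamma}_m - \bar{x}_m| = \frac{|H_m(t_{end}) - H_m(0)|}{t_{end}} \tag{29} $$

Note that $\gamma(\tau) \in \mathcal{X}$ for all $\tau \in \{0, \ldots, t_{end} - 1\}$,
and hence (by convexity of $\mathcal{X}$) the time average $\bar{\gamma}$ is also
in $\mathcal{X}$. Defining $\epsilon \triangleq \bar{\gamma} - \bar{x}$ thus ensures
that $\bar{x} + \epsilon \in \mathcal{X}$. Noting that
$\epsilon_m = \bar{\gamma}_m - \bar{x}_m$ and using (29) proves (28).

Finally, from (24) we have for any $l \in \{1, \ldots, L\}$ and for
all slots $t \geq 0$:

$$ Z_l(t+1) \geq Z_l(t) + y_l(t) + g_l(\gamma(t)) $$

Summing over $\tau \in \{0, \ldots, t - 1\}$ for any time $t > 0$ yields:

$$ Z_l(t) - Z_l(0) \geq \sum_{\tau=0}^{t-1} y_l(\tau) + \sum_{\tau=0}^{t-1} g_l(\gamma(\tau)) $$

Dividing by $t$ and rearranging terms proves that:

$$ \frac{1}{t} \sum_{\tau=0}^{t-1} [y_l(\tau) + g_l(\gamma(\tau))] \leq \frac{Z_l(t) - Z_l(0)}{t} \tag{30} $$

Using $t = t_{end}$ in the above inequality, together with Jensen's
inequality for the convex function $g_l(\cdot)$, yields:

$$ \bar{y}_l + g_l(\bar{\gamma}) \leq \frac{Z_l(t_{end}) - Z_l(0)}{t_{end}} \tag{31} $$

where $\bar{\gamma}$ is the time average of $\gamma(t)$ over the first $t_{end}$ slots.
However, by (13) we have:

$$ g_l(\bar{x}) \leq g_l(\bar{\gamma}) + \sum_{m=1}^{M} \beta_{l,m} |\bar{x}_m - \bar{\gamma}_m| $$

Using this and (29) in (31) proves (27).

□

^2 Also, in this case M = 0, we do not need any queues of the type (25).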



A. Quadratic Lyapunov Functions and Sample Path Drift 

Lemma 1 ensures the constraints (18)-(23) (and hence the
constraints (4)-(7)) are approximately satisfied if the final
queue states of all queues are small relative to $t_{end}$. Define
$\Theta(t)$ as a vector of all current queue states:

$$ \Theta(t) \triangleq [Z(t), Q(t), H(t)] $$

where $Z(t)$, $Q(t)$, $H(t)$ are vectors with entries $Z_l(t)$, $Q_k(t)$,
$H_m(t)$, respectively. As a scalar measure of queue size, define
the following quadratic Lyapunov function, as in [1]:

$$ L(\Theta(t)) \triangleq \frac{1}{2} \sum_{l=1}^{L} Z_l(t)^2 + \frac{1}{2} \sum_{k=1}^{K} Q_k(t)^2 + \frac{1}{2} \sum_{m=1}^{M} H_m(t)^2 $$

For a given positive integer T, let $\Delta_T(t)$ represent the T-slot
sample path Lyapunov drift associated with particular controls
implemented over the interval $\{t, \ldots, t + T - 1\}$ when the
queues have state $\Theta(t)$ at the beginning of the interval:^3

$$ \Delta_T(t) \triangleq L(\Theta(t + T)) - L(\Theta(t)) \tag{33} $$

This notion of T-slot drift differs from that given in [1] in that
it does not involve an expectation. It is difficult to control the
T-slot drift, because it depends on future (and hence unknown)
$\omega(t)$ values. Thus, following the approach in [1], we design
a control policy that, every slot $t$, observes the current $\omega(t)$
value and the current queue states $\Theta(t)$ and chooses a control
action $\alpha(t) \in \mathcal{A}_{\omega(t)}$ to minimize a weighted sum of the 1-slot
drift and the current contribution to the cost metric (17):

$$
\begin{aligned}
\text{Minimize:} \quad & \Delta_1(t) + V y_0(\alpha(t), \omega(t)) + V f(\gamma(t)) \\
\text{Subject to:} \quad & \text{Constraints (21)-(23)}
\end{aligned}
$$

where $V > 0$ is a control parameter chosen in advance that
affects a performance tradeoff. Rather than perform the exact
minimization of the above problem, it suffices to minimize a
bound. The following lemma bounds $\Delta_1(t)$.

^3 Note that the value of $\Delta_T(t)$ depends on the queue state $\Theta(t)$ at the start
of the T-slot interval, the random events $\{\omega(t), \ldots, \omega(t + T - 1)\}$, and the
control actions $\{\alpha(t), \ldots, \alpha(t + T - 1)\}$ that are chosen over this interval.

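Lemma 2: Under Assumption A1, the 1-slot drift $\Delta_1(t)$
satisfies:

$$
\begin{aligned}
\Delta_1(t) \leq B &+ \sum_{l=1}^{L} Z_l(t) [y_l(t) + g_l(\gamma(t))] \\
&+ \sum_{k=1}^{K} Q_k(t) [a_k(t) - b_k(t)] \\
&+ \sum_{m=1}^{M} H_m(t) [\gamma_m(t) - x_m(t)]
\end{aligned}
$$

where $B$ is a finite constant that satisfies for all $t$:

$$ B \geq \frac{1}{2} \sum_{l=1}^{L} (y_l(t) + g_l(\gamma(t)))^2 + \frac{1}{2} \sum_{k=1}^{K} [b_k(t)^2 + a_k(t)^2] + \frac{1}{2} \sum_{m=1}^{M} (\gamma_m(t) - x_m(t))^2 \tag{34} $$

Such a finite constant $B$ exists by the boundedness assumptions
A1, and a particular such $B$ is given in Appendix E.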

Proof: Squaring the $Z_l(t)$ update equation (24) and noting
that $\max[x, 0]^2 \leq x^2$ yields:

$$ Z_l(t+1)^2 \leq Z_l(t)^2 + (y_l(t) + g_l(\gamma(t)))^2 + 2 Z_l(t) [y_l(t) + g_l(\gamma(t))] \tag{35} $$

Similarly, from (1) we have:

$$
\begin{aligned}
Q_k(t+1)^2 &\leq (Q_k(t) - b_k(t))^2 + a_k(t)^2 + 2 a_k(t) \max[Q_k(t) - b_k(t), 0] \\
&\leq (Q_k(t) - b_k(t))^2 + a_k(t)^2 + 2 a_k(t) Q_k(t) \\
&= Q_k(t)^2 + a_k(t)^2 + b_k(t)^2 + 2 Q_k(t) [a_k(t) - b_k(t)] && (36)
\end{aligned}
$$

Finally, from (25) we have:

$$ H_m(t+1)^2 = H_m(t)^2 + (\gamma_m(t) - x_m(t))^2 + 2 H_m(t) (\gamma_m(t) - x_m(t)) \tag{37} $$

Summing (35), (36), (37) and dividing by 2 yields the result.

□
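
The drift bound of Lemma 2 can be checked numerically on random one-slot transitions. The sketch below is only a sanity check under assumed value ranges (all names and ranges are illustrative): it uses, for each sampled slot, the smallest constant $B$ allowed for that slot by the lemma, so any violation would indicate an error in the reconstruction of the bound.

```python
import random

def lyapunov(Z, Q, H):
    """Quadratic Lyapunov function: half the sum of squared queue values."""
    return 0.5 * (sum(z * z for z in Z) + sum(q * q for q in Q) + sum(h * h for h in H))

def drift_and_bound(Z, Q, H, y, g, a, b, gamma, x):
    """Return (Delta_1(t), right-hand side of Lemma 2 with the slot-wise smallest B)."""
    Zn = [max(z + yl + gl, 0.0) for z, yl, gl in zip(Z, y, g)]   # update (24)
    Qn = [max(q - bk, 0.0) + ak for q, ak, bk in zip(Q, a, b)]   # update (1)
    Hn = [h + gm - xm for h, gm, xm in zip(H, gamma, x)]         # update (25)
    drift = lyapunov(Zn, Qn, Hn) - lyapunov(Z, Q, H)
    B = 0.5 * (sum((yl + gl) ** 2 for yl, gl in zip(y, g))
               + sum(bk * bk + ak * ak for ak, bk in zip(a, b))
               + sum((gm - xm) ** 2 for gm, xm in zip(gamma, x)))
    rhs = (B
           + sum(z * (yl + gl) for z, yl, gl in zip(Z, y, g))
           + sum(q * (ak - bk) for q, ak, bk in zip(Q, a, b))
           + sum(h * (gm - xm) for h, gm, xm in zip(H, gamma, x)))
    return drift, rhs

def rnd(n, lo, hi):
    return [random.uniform(lo, hi) for _ in range(n)]

random.seed(1)
violations = 0
for _ in range(2000):
    d, r = drift_and_bound(rnd(2, 0, 10), rnd(3, 0, 10), rnd(2, -10, 10),
                           rnd(2, -3, 3), rnd(2, -1, 1), rnd(3, 0, 4), rnd(3, 0, 4),
                           rnd(2, -2, 2), rnd(2, -2, 2))
    violations += d > r + 1e-9
print("violations:", violations)
```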

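The above lemma shows that:

$$
\begin{aligned}
\Delta_1(t) + V y_0(\alpha(t), \omega(t)) + V f(\gamma(t)) \leq B &+ V y_0(\alpha(t), \omega(t)) + V f(\gamma(t)) \\
&+ \sum_{l=1}^{L} Z_l(t) [y_l(\alpha(t), \omega(t)) + g_l(\gamma(t))] \\
&+ \sum_{k=1}^{K} Q_k(t) [a_k(\alpha(t), \omega(t)) - b_k(\alpha(t), \omega(t))] \\
&+ \sum_{m=1}^{M} H_m(t) [\gamma_m(t) - x_m(\alpha(t), \omega(t))] && (38)
\end{aligned}
$$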

B. The General Universal Scheduling Algorithm 

Our universal scheduling algorithm is designed to minimize 
the right hand side of dJST l every slot, as described as follows: 
Every slot t, observe the random event w(t) and the current 
queue backlogs Zi{t), Qk{t), Hm{t) for all / g 
A; S {1, . . . , K}, m S {1, . . . , M}, and perform the following. 

• Choose 7(i) = (71 (i), . . . , 7A/(i)) to solve the following: 

Minimize: Vf{-f{t)) + Zi{t)gi{-f(t)) 

7(t) e X 



Subject to: 



X™" < lra{t) < a^r" Vm e {1, . . . , M} 



Choose a{t) € -4^(t) to minimize: 

L 

Vyo{a{t), ujit)) + J2 Zi{t)yi{a{t),uj{t)) 



E 



Hm{t)Xrnia{t),Uj{t)) 



K 



+ J2Qkit){akia{t),ij{t))^bk{ait),ujit))] (39) 

k=l 

• Update the actual queues Qfc(t) and the virtual queues 
Zi{t), Hm{t) for all S {1,...,K}, I € 
m e {1, . . . , M} via ©, (Elll, and (IB). 

Note that the above selection of $\gamma(t)$ minimizes a convex function over a convex set, and decomposes into M decoupled convex optimizations of one variable in the case when cost functions $f(x)$ and $g_l(x)$ have the separable structure of (14) and when the set X is equal to $\mathbb{R}^M$ or a hypercube in $\mathbb{R}^M$. The optimization of $\alpha(t)$ in (39) may be more complex and is possibly a non-convex or combinatorial problem (depending on the $A_{\omega(t)}$ set and the $\hat{x}_m(\cdot)$, $\hat{y}_l(\cdot)$, $\hat{b}_k(\cdot)$, and $\hat{a}_k(\cdot)$ functions). However, it is simple when the action space $A_{\omega(t)}$ contains only a finite (and small) number of options, in which case we can simply compare the functional (39) for each option. A key property of the above algorithm is that it is non-anticipating in that it acts only on the current $\omega(t)$ value, without knowledge of future values.
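
For the finite action space case, the comparison of the functional (39) across options can be sketched as follows. This is only an illustrative sketch: the function name `choose_action`, the argument layout, and the toy single-queue example in the usage note are assumptions and are not taken from the paper.

```python
def choose_action(actions, omega, V, Z, Q, H, y0, y, a, b, x):
    """Pick the action in a finite set that minimizes the functional (39).

    y0, and each entry of y, a, b, x, is a callable f(alpha, omega)
    (hypothetical stand-ins for the paper's attribute functions).
    Z, Q, H are lists of current queue backlogs.
    """
    def cost(alpha):
        total = V * y0(alpha, omega)
        total += sum(Z[l] * y[l](alpha, omega) for l in range(len(Z)))
        total -= sum(H[m] * x[m](alpha, omega) for m in range(len(H)))
        total += sum(Q[k] * (a[k](alpha, omega) - b[k](alpha, omega))
                     for k in range(len(Q)))
        return total

    # Only the current omega and current backlogs are used (non-anticipating).
    return min(actions, key=cost)


# Toy example (assumed): one queue, two actions, unit power cost for serving.
power = lambda alpha, omega: 1.0 if alpha == "serve" else 0.0
arrivals = lambda alpha, omega: omega
service = lambda alpha, omega: 2.0 if alpha == "serve" else 0.0
```

With an empty queue the sketch idles (serving only costs power), while a backlog of 3 makes the weighted service term dominate and the sketch serves.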

It can be shown that the expression (39) has a well defined minimum value over the set $A_{\omega(t)}$ whenever Assumption A4 in Appendix B holds. However, the next theorem allows for approximate minimization, where the choice of decision variables $\gamma(t)$ and $\alpha(t)$ leads to a value that is off by an additive constant from achieving the minimum (or infimum) of the right hand side in (38). This is similar to the approximation results in [1] and references therein, developed for ergodic problems. Specifically, we define an algorithm to be C-approximate if every slot it makes decisions $\gamma(t)$ and $\alpha(t)$ to satisfy the constraints (21)-(23) and to yield either the infimum of the expression on the right hand side of (38) (as described in the algorithm above), or to yield a value on the right hand side that differs from the infimum by at most an additive constant $C \geq 0$.

Theorem 1: Suppose Assumptions A1 and A2 hold. Consider any C-approximate algorithm. Let the random event sequence $\{\omega(0), \omega(1), \omega(2), \ldots\}$ be arbitrary. Then:

(a) For any slot $t > 0$ we have:

$$\sum_{l=1}^{L} Z_l(t)^2 + \sum_{k=1}^{K} Q_k(t)^2 + \sum_{m=1}^{M} H_m(t)^2 \leq tVC_0^2 + 2L(\Theta(0))$$

where the constant $C_0$ is defined:

$$C_0 \triangleq \sqrt{2[(B + C)/V + (y_0^{max} - y_0^{min}) + (f^{max} - f^{min})]}$$

where B is the non-negative constant defined in (34). In particular, all queues are bounded as follows:

$$Z_l(t), Q_k(t), |H_m(t)| \leq \sqrt{tVC_0^2 + 2L(\Theta(0))}$$

This bound becomes $C_0\sqrt{tV}$ if all queues are initially empty.

(b) If all queues are initially empty, then for any designated time $t_{end} > 0$ we have:

$$\overline{y}_l + g_l(\overline{x}) \leq C_0\sqrt{\frac{V}{t_{end}}}\left[1 + \sum_{m=1}^{M} \nu_{l,m}\right] \quad \forall l \in \{1, \ldots, L\}$$

$$\overline{a}_k \leq \overline{b}_k + C_0\sqrt{\frac{V}{t_{end}}} \quad \forall k \in \{1, \ldots, K\}$$

$$\overline{x} + \epsilon(t) \in X$$

where $\epsilon(t) = (\epsilon_1(t), \ldots, \epsilon_M(t))$ has entries that satisfy:

$$|\epsilon_m(t)| \leq C_0\sqrt{\frac{V}{t_{end}}}$$

Thus, the constraints (7) of the original problem are satisfied, and the constraints (4)-(6) of the original problem are approximately satisfied, where the error term in the approximation decays with $t_{end}$ according to a constant multiple of $\sqrt{V/t_{end}}$.
(c) Consider any positive integer frame size T, any positive integer R, and define $t_{end} \triangleq RT$. Then the value of the system cost metric over $t_{end}$ slots satisfies:

$$\overline{y}_0 + f(\overline{x}) \leq \frac{1}{R}\sum_{r=0}^{R-1} F_r^* + \frac{B + C}{V} + \frac{D(T-1)}{V} + \frac{L(\Theta(0))}{VRT} + \sum_{m=1}^{M} \nu_m \frac{|H_m(RT) - H_m(0)|}{RT} \qquad (40)$$

where $\overline{y}_0$, $\overline{x}$ are time averages over the first $t_{end}$ slots, and where $F_r^*$ is the optimal solution to the problem (15) and represents the optimal cost achieved by an idealized T-slot lookahead policy implemented over the rth frame of size T.
The constant D is a finite constant that satisfies for all t and all possible control actions that can be implemented on slot t:

$$D \geq \frac{1}{2}\sum_{l=1}^{L} z_l^{diff}|y_l(t) + g_l(\gamma(t))| + \frac{1}{2}\sum_{m=1}^{M} h_m^{diff}|x_m(t) - \gamma_m(t)| + \frac{1}{2}\sum_{k=1}^{K} q_k^{diff}\max[b_k(t), a_k(t)] \qquad (41)$$

where $z_l^{diff}$, $q_k^{diff}$, $h_m^{diff}$ represent the maximum change in queues $Z_l(t)$, $Q_k(t)$, $H_m(t)$ over one slot, given by:



\yi; 



■ 9T 



diff 

Ik 



„min I 
Such a finite constant exists by the boundedness assumptions, and a value of D that satisfies inequality (41) is given in Appendix E.

Finally, if initial queue backlogs are 0, the final term in (40) is bounded by:

$$\sum_{m=1}^{M} \nu_m \frac{|H_m(RT) - H_m(0)|}{RT} \leq C_0\sqrt{\frac{V}{RT}}\sum_{m=1}^{M} \nu_m$$

Proof: See Appendix A. □ 



C. Discussion of Theorem 1

Consider the simple case when queues are initially 0, $g_l(x) = 0$ for all l (so that $\nu_{l,m} = 0$), and where we use a C-approximate algorithm for some constant $C \geq 0$. Fix a positive integer frame size T. Theorem 1 can be interpreted as follows: The algorithm implemented over $t_{end} = RT$ slots ensures that the desired constraints (4)-(6) are approximately met to within a "fudge factor" given by:

$$\text{Constraint fudge factor} = C_0\sqrt{V/(RT)}$$

where $C_0$ is the constant defined in part (a) of Theorem 1. This fudge factor is made arbitrarily small when RT is sufficiently larger than V. Further, the achieved cost is either smaller than the cost $\frac{1}{R}\sum_{r=0}^{R-1} F_r^*$ associated with an ideal T-slot lookahead policy implemented over R successive frames of length T, or differs from this value by an amount no more than a fudge factor that satisfies:

$$\text{Cost fudge factor} = C_1 T/V + C_2\sqrt{V/(RT)}$$

The value of V can be chosen so that $C_1 T/V$ is arbitrarily small, in which case both the cost fudge factor and the constraint fudge factor are arbitrarily small provided that R is sufficiently large, that is, provided that we wait for a sufficiently large number of frames. The constants $C_1$ and $C_2$ are given by:

$$C_1 \triangleq (B + C - D)/T + D$$

$$C_2 \triangleq C_0\sum_{m=1}^{M} \nu_m$$

Note that $C_2 = 0$ in the case when $f(x) = 0$ (so that $\nu_m = 0$ for all m). Finally, note that the value of T does not need to be chosen in order to implement the algorithm (we need only choose a value of V), and hence the above bounds can be optimized over all positive integers T.

The above tradeoffs described by V and R hold for general problems of the type (3)-(7), and can be tightened for particular problems such as the network problem described in the next section, which provides queue bounds that do not grow with time. A similar strengthening due to constant queue bounds can be shown for the general problem in the case when Assumption A1 is strengthened to a "Slater-type" condition, as described in Section VI.

D. Linear Cost Functions

The auxiliary variables are crucial for optimization of time varying systems with non-linear cost functions $f(x)$, $g_l(x)$. However, they are not needed when cost functions are linear (or affine). For example, let $x(t)$ be a vector of attributes as defined before, and consider the problem:

Minimize: $\overline{h_0(x)}$

Subject to: $\overline{h_l(x)} \leq 0 \quad \forall l \in \{1, \ldots, L\}$

$\alpha(t) \in A_{\omega(t)} \quad \forall t \in \{0, \ldots, t_{end} - 1\}$

where $h_l(x)$ are affine functions (i.e., linear plus a constant), so that $\overline{h_l(x)} = h_l(\overline{x})$. This can of course be treated using the framework of (3)-(7) with $h_0(x) = f(x)$, and $g_l(x) = h_l(x)$, $y_l(x) = 0$. However, it can also be treated using (3)-(7) with $f(x) = g_l(x) = 0$ for all $l \in \{1, \ldots, L\}$ and $y_l(x) = h_l(x)$ (noting that $\overline{y}_l = h_l(\overline{x})$). This latter method is advantageous because it has $\nu_m = \nu_{l,m} = 0$ for all l and m, which tightens the constraint inequalities (27) and the cost guarantee (40). Thus, it is useful to exploit linearity whenever possible.
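
To make the tradeoff from the discussion of Theorem 1 concrete, the two fudge factors can be evaluated numerically. The sketch below is a plain transcription of the formulas $C_0\sqrt{V/(RT)}$ and $C_1T/V + C_2\sqrt{V/(RT)}$; the function name and the numbers in the usage note are illustrative assumptions, not values from the paper.

```python
import math


def fudge_factors(B, C, D, C0, nu, V, R, T):
    """Return (constraint fudge factor, cost fudge factor).

    nu is the list of constants nu_m; all inputs are assumed to be given.
    """
    c1 = (B + C - D) / T + D          # so that c1*T/V = (B+C)/V + D(T-1)/V
    c2 = C0 * sum(nu)
    root = math.sqrt(V / (R * T))
    constraint = C0 * root
    cost = c1 * T / V + c2 * root
    return constraint, cost
```

For example, with the assumed values B=1, C=0, D=2, C0=3, nu=[0.5, 0.5], V=100 and T=4, increasing R shrinks the square-root terms while the $C_1T/V$ term stays fixed, which is why V is chosen first and R is then taken large.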



E. Infinite Horizons and Ergodicity 

Consider now the problem (3)-(7) over an infinite horizon, so that time averages $\overline{y}_l$, $\overline{x}$, $\overline{a}_k$, $\overline{b}_k$ represent limiting averages over the infinite horizon. Suppose that Assumptions A1 and A2 hold, and that we use a C-approximate algorithm so that Theorem 1 applies. By taking a limit as $t_{end} \to \infty$, Theorem 1b implies that all required infinite horizon constraints are met. Specifically, we have:

$$\limsup_{t\to\infty}\,[\overline{y}_l(t) + g_l(\overline{x}(t))] \leq 0 \quad \forall l \in \{1, \ldots, L\} \qquad (42)$$

$$\limsup_{t\to\infty}\,[\overline{a}_k(t) - \overline{b}_k(t)] \leq 0 \quad \forall k \in \{1, \ldots, K\} \qquad (43)$$

$$\lim_{i\to\infty} \overline{x}(t_i) \in X \qquad (44)$$

where $\overline{y}_l(t)$, $\overline{x}(t)$, $\overline{a}_k(t)$, $\overline{b}_k(t)$ represent time averages over the first t slots, and where $\{t_i\}$ is any subsequence of times over which $\overline{x}(t)$ converges.* Further, Theorem 1c implies that the infinite horizon cost satisfies:

$$\lim_{R\to\infty}\,[\overline{y}_0(RT) + f(\overline{x}(RT))] \leq \lim_{R\to\infty} \frac{1}{R}\sum_{r=0}^{R-1} F_r^* + \frac{B + C}{V} + \frac{D(T-1)}{V} \qquad (45)$$

Consider now the special case when the random events $\{\omega(0), \omega(1), \omega(2), \ldots\}$ evolve according to a general ergodic process with a well defined time average probability distribution. In this case and under some mild assumptions, it can be shown that the optimal infinite horizon cost $f^*$ can be achieved over the class of stationary and randomized algorithms, that $F_r^*$ is close to $f^*$ for each r whenever T is sufficiently large, and that the term $\frac{1}{R}\sum_{r=0}^{R-1} F_r^*$ converges to $f^*$ plus an error term that is bounded by $\delta(T)$, where $\delta(T)$ is a function that satisfies $\lim_{T\to\infty} \delta(T) = 0$. This is discussed in more detail in Appendix C.

*Note that $\overline{x}(t)$ is an infinite sequence (with time index t) that takes values in a compact set, and so it has a convergent subsequence.






IV. A Simple Internet Model 

Here we apply the universal scheduling framework to a simple flow based internet model, where we neglect the actual network queueing and develop a flow control policy that simply ensures the flow rate over any link is not more than the link capacity (similar to the flow based models in [23] [24] [25] [26] [27]). Section V treats a more extensive network model that explicitly accounts for all queues.

Suppose there are N nodes and L links, where each link $l \in \{1, \ldots, L\}$ has a possibly time-varying link capacity $C_l(t)$, for slotted time $t \in \{0, 1, 2, \ldots\}$. Suppose there are M sessions, and let $A_m(t)$ represent the new arrivals to session m on slot t. Each session $m \in \{1, \ldots, M\}$ has a particular source node and a particular destination node. Assume link capacities and newly arriving traffic are bounded so that:

$$0 \leq C_l(t) \leq C_l^{max} \;\; \forall t, \qquad 0 \leq A_m(t) \leq A_m^{max} \;\; \forall t \qquad (46)$$

for some finite constants $C_l^{max}$ and $A_m^{max}$. The random network event $\omega(t)$ is thus given by:

$$\omega(t) \triangleq [(C_1(t), \ldots, C_L(t)); (A_1(t), \ldots, A_M(t))] \qquad (47)$$

Recall that $\omega(t)$ is an arbitrary sequence with no probability model. The control action taken every slot is to first choose $x_m(t)$, the amount of type m traffic admitted into the network on slot t, according to:

$$0 \leq x_m(t) \leq A_m(t)$$

Next, we must specify a path for this data, from a collection of paths $\mathcal{P}_m(t)$ associated with path options of session m on slot t (possibly being the set of all possible paths in the network from the source of session m to its destination).* Here, a path is defined in the usual sense, being a sequence of links starting at the source, ending at the destination, and being such that the end node of each link is the start node of the next link. Let $1_{l,m}(t)$ be an indicator variable that is 1 if the data $x_m(t)$ is selected to use a path that contains link l, and 0 otherwise. The $(1_{l,m}(t))$ values completely specify the chosen paths for slot t, and hence the decision variable for slot t is given by:

$$\alpha(t) \triangleq [(x_1(t), \ldots, x_M(t)); (1_{l,m}(t))|_{l \in \{1, \ldots, L\}, m \in \{1, \ldots, M\}}]$$

Let $\overline{x} = (\overline{x}_1, \ldots, \overline{x}_M)$ be a vector of the infinite horizon time average admitted flow rates, and let $\phi(\overline{x}) = \sum_{m=1}^{M} \phi_m(\overline{x}_m)$ be a separable utility function. Assume that each $\phi_m(x)$ is a continuous, concave, non-decreasing function in x, with maximum right derivative $\nu_m < \infty$. Our goal is to maximize the throughput-utility $\phi(\overline{x})$ subject to the constraints that the time average flow over each link l is less than or equal to the time average capacity of that link. The infinite horizon utility optimization problem of interest is thus:

Maximize: $\sum_{m=1}^{M} \phi_m(\overline{x}_m)$

Subject to: $\sum_{m=1}^{M} \overline{1_{l,m}x_m} \leq \overline{C}_l \quad \forall l \in \{1, \ldots, L\}$

where the time averages are defined:

$$\overline{x}_m \triangleq \lim_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} x_m(\tau)$$

$$\overline{1_{l,m}x_m} \triangleq \lim_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} 1_{l,m}(\tau)x_m(\tau)$$

$$\overline{C}_l \triangleq \lim_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} C_l(\tau)$$

*Strictly speaking, if there are time varying path choices then we should augment $\omega(t)$ in (47) to include $\mathcal{P}_m(t)$ for all $m \in \{1, \ldots, M\}$.

This is equivalent to minimizing the convex function $f(x) = -\phi(x)$, and hence exactly fits our framework. There is no set constraint, so that $X = \mathbb{R}^M$. As there are no actual queues $Q_k(t)$ in this model, we use only virtual queues $Z_l(t)$ and $H_m(t)$, defined by update equations:

$$Z_l(t+1) = \max\left[Z_l(t) + \sum_{m=1}^{M} 1_{l,m}(t)x_m(t) - C_l(t),\, 0\right] \qquad (48)$$

$$H_m(t+1) = H_m(t) + \gamma_m(t) - x_m(t) \qquad (49)$$

where $\gamma_m(t)$ are auxiliary variables for $m \in \{1, \ldots, M\}$. This is equivalent to the general framework with $y_l(t) = \sum_{m=1}^{M} 1_{l,m}(t)x_m(t) - C_l(t)$ and $g_l(\cdot) = 0$ for $l \in \{1, \ldots, L\}$. Note that Assumption A1 holds by the boundedness assumptions (46), and Assumption A2 holds because this system has an "idle" control action that admits no new data (so that $y_l(t) \leq 0$ for all l under this idle action). The general universal scheduling algorithm for this problem thus reduces to:

• (Auxiliary Variables) Every slot t, each session $m \in \{1, \ldots, M\}$ observes $H_m(t)$ and chooses $\gamma_m(t)$ as the solution to:

Maximize: $V\phi_m(\gamma_m(t)) - H_m(t)\gamma_m(t)$ (50)
Subject to: $0 \leq \gamma_m(t) \leq A_m^{max}$ (51)

This is a simple maximization of a concave single variable function over an interval.

• (Routing and Flow Control) For each slot t and each session $m \in \{1, \ldots, M\}$, observe the new arrivals $A_m(t)$, the queue backlog $H_m(t)$, and the link queues $Z_l(t)$, and choose $x_m(t)$ and a path to maximize:

Maximize: $x_m(t)H_m(t) - x_m(t)\sum_{l=1}^{L} 1_{l,m}(t)Z_l(t)$
Subject to: $0 \leq x_m(t) \leq A_m(t)$
The path specified by $(1_{l,m}(t))$ is in $\mathcal{P}_m(t)$

This reduces to the following: First find a shortest path from the source of session m to the destination of session m, using link weights $Z_l(t)$ as link costs. If the total weight of the shortest path is less than or equal to $H_m(t)$, then choose $x_m(t) = A_m(t)$ and route all of this data over this single shortest path. Else, there is too much congestion in the network, and so we choose $x_m(t) = 0$ (thereby dropping all data $A_m(t)$).

• (Virtual Queue Updates) Update the virtual queues according to (48) and (49).
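
The routing and flow control step and the update (48) can be sketched as below. This is a minimal sketch under stated assumptions: links are directed and given as a hypothetical dictionary `{link_id: (tail, head)}`, the path set of each session is taken to be all paths in the graph, and Dijkstra's algorithm is used as the shortest path routine (valid here because the weights $Z_l(t)$ are non-negative). None of the names come from the paper.

```python
import heapq


def shortest_path(links, Z, src, dst):
    """Dijkstra with link weights Z[link_id]; returns (weight, [link ids])."""
    adj = {}
    for lid, (u, v) in links.items():
        adj.setdefault(u, []).append((v, lid))
    dist, prev, heap = {src: 0.0}, {}, [(0.0, src)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        if u == dst:
            break
        for v, lid in adj.get(u, []):
            nd = d + Z[lid]
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, (u, lid)
                heapq.heappush(heap, (nd, v))
    if dst not in dist:
        return float("inf"), []
    path, node = [], dst
    while node != src:
        node, lid = prev[node]
        path.append(lid)
    return dist[dst], path[::-1]


def route_and_admit(links, Z, H_m, A_m, src, dst):
    """Admit all arrivals on the shortest path if its weight is <= H_m, else drop."""
    weight, path = shortest_path(links, Z, src, dst)
    return (A_m, path) if weight <= H_m else (0.0, [])


def update_Z(Z, C, admitted):
    """Virtual queue update (48); admitted is a list of (x_m, path) pairs."""
    load = {lid: 0.0 for lid in Z}
    for x_m, path in admitted:
        for lid in path:
            load[lid] += x_m
    return {lid: max(Z[lid] + load[lid] - C[lid], 0.0) for lid in Z}
```

The admission test compares a single number per session (the shortest path weight) against $H_m(t)$, so each source only needs the current link weights along its candidate paths.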

The shortest path routing in this algorithm is similar to that given in [26], which treats a flow-based network stability problem in an ergodic setting under the assumption that arriving traffic is admissible (so that flow control is not used).

Assume for simplicity that all queues are initially empty. Then for any frame size T and any number of frames R, from Theorem 1c, we know the utility of this algorithm satisfies:

$$\phi(\overline{x}) \geq \frac{1}{R}\sum_{r=0}^{R-1} \Phi_r^* - \frac{B}{V} - \frac{D(T-1)}{V} - \sum_{m=1}^{M} \frac{\nu_m|H_m(RT)|}{RT} \qquad (52)$$

where $\overline{x}$ represents a time average over the first RT slots, and $\Phi_r^*$ represents the utility achieved by the T-slot lookahead policy implemented over slots $\{rT, \ldots, rT + T - 1\}$. Here we are assuming we use an exact implementation of the algorithm (a 0-approximation), so that $C = 0$. The constants B and D given in (34) and (41) are simplified for this context without queues $Q_k(t)$, and are provided in Appendix E.

Furthermore, the infinite horizon constraints are satisfied by (42), and bounds on the virtual queue sizes for any time $t > 0$ are also given in Theorem 1. However, with this particular structure we can obtain tighter bounds. Indeed, from the update equation for $H_m(t)$ in (49) and the auxiliary variable algorithm defined in (50)-(51), it is easy to see that:

• If $H_m(t) < 0$, then $\gamma_m(t) = A_m^{max}$ and hence $H_m(t)$ cannot decrease on the next slot.

• If $H_m(t) > V\nu_m$, then $\gamma_m(t) = 0$ and hence $H_m(t)$ cannot increase on the next slot.

It easily follows that:

$$-A_m^{max} \leq H_m(t) \leq V\nu_m + A_m^{max} \quad \forall t \in \{0, 1, 2, \ldots\} \qquad (53)$$

provided that this is true for $H_m(0)$ (which is indeed the case if $H_m(0) = 0$). Therefore, the final term in the utility guarantee is bounded by:

$$\sum_{m=1}^{M} \frac{\nu_m|H_m(RT)|}{RT} \leq \sum_{m=1}^{M} \frac{\nu_m(V\nu_m + A_m^{max})}{RT}$$

which goes to zero as the number of frames R goes to infinity. We further note that the utility guarantee (52) can be modified to apply to any interval of RT slots, starting at any slot $t_0$, provided that we modify the equation to account for possibly non-zero initial queue conditions according to (40).
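
The bound (53) can be checked on a simulated sample path. The sketch below assumes, purely for illustration, the utility $\phi_m(\gamma) = \log(1 + \gamma)$, whose right derivative at 0 is $\nu_m = 1$; for this choice the maximizer of (50)-(51) has the closed form used in `aux_variable`. The admitted amounts $x_m(t)$ are drawn at random in $[0, A_m^{max}]$ only as a stand-in for an arbitrary sequence.

```python
import random


def aux_variable(H, V, A_max):
    """Maximize V*log(1+g) - H*g over 0 <= g <= A_max (assumed log utility)."""
    if H <= 0:
        return A_max                      # objective is increasing in g
    return min(max(V / H - 1.0, 0.0), A_max)


def simulate_H(V, A_max, slots, seed=0):
    """Run the virtual queue update (49) and return the trajectory of H_m(t)."""
    rng = random.Random(seed)
    H, trace = 0.0, []
    for _ in range(slots):
        gamma = aux_variable(H, V, A_max)
        x = rng.uniform(0.0, A_max)       # stand-in for an arbitrary admitted amount
        H = H + gamma - x
        trace.append(H)
    return trace
```

With $V = 10$ and $A_m^{max} = 2$, every value of the trajectory stays in $[-2, 12]$, in agreement with (53) for $\nu_m = 1$.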

Further, the fact that the queues $H_m(t)$ are deterministically bounded allows one to deterministically bound the queue sizes $Z_l(t)$ as follows: For all $l \in \{1, \ldots, L\}$ we have:

$$0 \leq Z_l(t) \leq V\nu^{max} + (M + 1)A^{max} \quad \forall t \qquad (54)$$

where $\nu^{max}$ and $A^{max}$ are defined as the maximum of all $\nu_m$ and $A_m^{max}$ values:

$$\nu^{max} \triangleq \max_{m \in \{1, \ldots, M\}} \nu_m, \qquad A^{max} \triangleq \max_{m \in \{1, \ldots, M\}} A_m^{max}$$

The proof of this fact is simple: If a link l satisfies $Z_l(t) \leq V\nu^{max} + A^{max}$, then on the next slot we have $Z_l(t+1) \leq V\nu^{max} + (M + 1)A^{max}$, because the queue can increase by at most $MA^{max}$ on any slot (see update equation (48)). Else, if $Z_l(t) > V\nu^{max} + A^{max}$, then any path that uses this link incurs a cost larger than $V\nu^{max} + A^{max}$, which is larger than $H_m(t)$ for any session m by (53). Thus, by the routing and flow control algorithm, no session will choose a path that uses this link on the current slot, and so $Z_l(t)$ cannot increase on the next slot.



A. Delayed Feedback 

We note that it may be difficult to use the exact queue values $Z_l(t)$ when solving for the shortest path, as these values change every slot. Hence, a practical implementation may use out-of-date values $Z_l(t - \tau_{l,t})$ for some time delay $\tau_{l,t} \geq 0$ that may depend on l and t. Further, the virtual queue updates for $Z_l(t)$ in (48) are most easily done at each link l, in which case the actual admitted data $x_m(t)$ for that link may not be known until some time delay, arriving as a process $x_m(t - \tau_{l,m,t})$. However, as the virtual queue size cannot change by more than a fixed amount every slot, the queue value used differs from the ideal queue value by no more than an additive constant that is proportional to the maximum time delay. In this case, provided that the maximum time delay is bounded, we are simply using a C-approximation and the utility and queue bounds are adjusted accordingly. A more extensive treatment of delayed feedback for the case of networks without dynamic arrivals or channels is found in [27], which uses a differential equation method.



B. Treating Wireless Networks with this Model 

The above model can be applied equally to wireless networks. However, an important extension in this case is to allow the link capacities $(C_1(t), \ldots, C_L(t))$ to be functions of a network resource allocation decision (this is treated more extensively in Section V). This resource allocation can be viewed as part of the network control action taken every slot. It is not difficult to show from the general solution in Section III-B that the optimal resource allocation decision should observe $Z_l(t)$ values and choose capacities $C_l(t)$ to maximize the following weighted sum:

$$\sum_{l=1}^{L} C_l(t)Z_l(t)$$

Depending on the network model, this maximization can be difficult, and is generally prohibitively complex for wireless networks with interference. Fortunately, it is easy to show that if the attempted max-weight solution comes within a factor $\theta$ of the optimum max-weight decision every slot (for some value $\theta$ such that $0 < \theta \leq 1$), then the same utility guarantees hold with $\Phi_r^*$ re-defined as the optimal T-slot lookahead utility in a network with link capacities that are reduced by a factor $\theta$ from their actual values. This follows easily by noting that such a $\theta$-multiplicative-approximate algorithm yields a right-hand-side in the drift bound (38) that is less than or equal to the right-hand-side associated with the optimal max-weight decisions implemented on a network with $\theta$-reduced capacities (see also [1] for a more detailed discussion of this for the case of i.i.d. $\omega(t)$ events).
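
When the resource allocation options form a small finite set, the weighted sum above can be maximized by direct comparison, and a candidate decision can be tested for being within a factor $\theta$ of the optimum. The sketch below assumes each option is given directly as a capacity vector $(C_1, \ldots, C_L)$; the function names are hypothetical.

```python
def weight(C, Z):
    """Weighted sum of capacities: sum over links of C_l * Z_l."""
    return sum(c * z for c, z in zip(C, Z))


def max_weight_capacity(options, Z):
    """Return the capacity vector in a finite option set with the largest weight."""
    return max(options, key=lambda C: weight(C, Z))


def is_theta_approx(chosen, options, Z, theta):
    """Check whether 'chosen' attains at least theta times the best weight."""
    best = max(weight(C, Z) for C in options)
    return weight(chosen, Z) >= theta * best
```

For instance, with two links, backlogs Z = (3, 5) and options (1, 0), (0, 1), (0.4, 0.4), the max-weight choice is (0, 1); choosing (1, 0) instead is a 0.5-approximation but not a 0.9-approximation.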
C. Limitations of this Model

The deterministic bound on $Z_l(t)$ in (54) ensures that, over any interval of K slots (for any positive integer K and any initial slot $t_0$), the total data injected for use over link l is no more than $V\nu^{max} + (M + 1)A^{max}$ beyond the total capacity offered by the link:

$$\sum_{\tau=t_0}^{t_0+K-1}\sum_{m=1}^{M} 1_{l,m}(\tau)x_m(\tau) \leq \sum_{\tau=t_0}^{t_0+K-1} C_l(\tau) + V\nu^{max} + (M + 1)A^{max}$$

While this is a very strong deterministic bound that says no 
link is given more data than it can handle, it does not directly 
imply anything about the actual network queues (other than 
the links are not overloaded). The (unproven) understanding is 
that, because the links are not overloaded, the actual network 
queues will be stable and all data can arrive to its destination 
with (hopefully small) delay. 

One might approximate average congestion or delay on 
a link as a convex function of the time average flow rate 
over the link, as in [28] [29] [27]. This can be incorporated 
using the general framework of Section II, which allows for
optimization of time averages or convex functions of time 
averages. However, we emphasize that this is only an approx- 
imation and does not represent the actual network delay, or 
even a bound on delay. Indeed, while it is known that average 
queue congestion and delay is convex if a general stream of 
traffic is probabilistically split [30], this is not necessarily true 
(or relevant) for dynamically controlled networks, particularly 
when the control depends on the queue backlogs and delays 
themselves. Most problems involving optimization of actual 
network delay are difficult and unsolved. Such problems 
involve not only optimization of rate based utility functions, 
but engineering of the Lagrange multipliers (which are related 
to queue backlogs) associated with those utility functions. 

Finally, observe that the update equation for $Z_l(t)$ in (48)
can be interpreted as a queueing model where all admitted
data on slot t is placed immediately on all links l of its path.
Similar models are used in [24] [25] [27] [31]. However, this is 
clearly an approximation, because data in an actual network 
will traverse its path one link at a time. It is assumed that 
the actual network stamps all data with its intended path, so 
that there is no dynamic re-routing mid-path. Section V treats
an actual multi-hop queueing network, and allows dynamic 
routing without pre-specified paths. 

V. Universal Network Scheduling 

Consider a network with N nodes that operates in slotted time. There are M sessions, and we let $A(t) = (A_1(t), \ldots, A_M(t))$ represent the vector of data that exogenously arrives to the transport layer for each session on slot t (measured either in integer units of packets or real units of bits). We assume that arrivals are bounded by constants $A_m^{max}$, so that:

$$0 \leq A_m(t) \leq A_m^{max} \quad \forall t$$

Each session $m \in \{1, \ldots, M\}$ has a particular source node and destination node. Data delivery takes place by transmissions over possibly multi-hop paths. We assume that a transport layer flow controller observes $A_m(t)$ every slot and decides how much of this data to add to the network layer at its source node, and how much to drop (flow control decisions are made to limit queue buffers and ensure the network is stable). Let $(x_m(t))|_{m=1}^{M}$ be the collection of flow control decision variables on slot t. These decisions are made subject to the constraints:

$$0 \leq x_m(t) \leq A_m(t) \quad \forall m \in \{1, \ldots, M\}, \forall t \qquad (55)$$

All data that is intended for destination node $c \in \{1, \ldots, N\}$ is called commodity c data, regardless of its particular session. For each $n \in \{1, \ldots, N\}$ and $c \in \{1, \ldots, N\}$, let $\mathcal{M}_n^{(c)}$ denote the set of all sessions $m \in \{1, \ldots, M\}$ that have source node n and commodity c. All data is queued according to its commodity, and we define $Q_n^{(c)}(t)$ as the amount of commodity c data in node n on slot t. We assume that $Q_n^{(n)}(t) = 0$ for all t, as data that reaches its destination is removed from the network. Let $Q(t)$ denote the matrix of current queue backlogs for all nodes and commodities.

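The queue backlogs change from slot to slot as follows:

$$Q_n^{(c)}(t+1) = Q_n^{(c)}(t) - \sum_{j=1}^{N} \tilde{\mu}_{nj}^{(c)}(t) + \sum_{i=1}^{N} \tilde{\mu}_{in}^{(c)}(t) + \sum_{m \in \mathcal{M}_n^{(c)}} x_m(t)$$

where $\tilde{\mu}_{ij}^{(c)}(t)$ denotes the actual amount of commodity c data transmitted from node i to node j (i.e., over link (i, j)) on slot t. It is useful to define transmission decision variables $\mu_{ij}^{(c)}(t)$ as the bit rate offered by link (i, j) to commodity c data, where this full amount is used if there is that much commodity c data available at node i, so that:

$$\tilde{\mu}_{ij}^{(c)}(t) \leq \mu_{ij}^{(c)}(t) \quad \forall i, j, c \in \{1, \ldots, N\}, \forall t$$

For simplicity, we assume that if there is not enough data to send at the offered rate, then null data is sent, so that:*

$$Q_n^{(c)}(t+1) = \max\left[Q_n^{(c)}(t) - \sum_{j=1}^{N} \mu_{nj}^{(c)}(t),\, 0\right] + \sum_{i=1}^{N} \mu_{in}^{(c)}(t) + \sum_{m \in \mathcal{M}_n^{(c)}} x_m(t) \qquad (56)$$

This satisfies (1) if we relate index k (for $Q_k(t)$ in (1)) to index (n, c) (for $Q_n^{(c)}(t)$ in (56)), and if we define:

$$b_k(t) \triangleq \sum_{j=1}^{N} \mu_{nj}^{(c)}(t), \qquad a_k(t) \triangleq \sum_{i=1}^{N} \mu_{in}^{(c)}(t) + \sum_{m \in \mathcal{M}_n^{(c)}} x_m(t)$$

*All results hold exactly as stated if this null data is not sent, because the drift bound in Lemma 2 holds exactly when the update equation (56) is replaced by an inequality $\leq$, see [1].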
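
The update (56) for a single node and commodity can be sketched as follows. The data layout is an assumption made for illustration: nodes are indexed from 0 to N-1 (rather than 1 to N), backlogs are stored in a dictionary keyed by (n, c), and offered rates in a dictionary keyed by (i, j, c) with missing keys treated as zero.

```python
def update_queue(Q, mu, x_in, n, c, N):
    """One-slot update of the commodity-c backlog at node n, following (56).

    Q[(n, c)]   : current backlog
    mu[(i,j,c)] : offered transmission rate of commodity c on link (i, j)
    x_in        : total newly admitted data of sessions with source n, commodity c
    """
    if n == c:
        return 0.0  # data at its destination leaves the network
    served = sum(mu.get((n, j, c), 0.0) for j in range(N))
    arrived = sum(mu.get((i, n, c), 0.0) for i in range(N))
    # The max[.,0] reflects that at most the current backlog can be served;
    # endogenous arrivals count at the full offered rate (null data model).
    return max(Q[(n, c)] - served, 0.0) + arrived + x_in
```

For example, a backlog of 5 with offered departures 3, offered endogenous arrivals 1 and admitted data 2 becomes 5; a backlog of 1 under the same rates becomes 3, since only one unit can actually leave.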
A. Transmission Variables

Let $S(t)$ represent the topology state of the network on slot t, observed on each slot t as in [1]. The value of $S(t)$ is an abstract and possibly multi-dimensional quantity that describes the current link conditions between all nodes under the current slot. The collection of all transmission rates that can be offered over each link (i, j) of the network is given by a general transmission rate function $C(I(t), S(t))$,* where $I(t)$ is a general network-wide resource allocation decision (such as link scheduling, bandwidth selection, modulation, etc.) and takes values in some abstract set $\mathcal{I}_{S(t)}$ that possibly depends on the current $S(t)$. We assume that the transmission rate function $C_{ij}(I(t), S(t))$ is non-negative and bounded by a finite constant for all $I(t)$ and $S(t)$.

Every slot the network controller observes the current $S(t)$ and makes a resource allocation decision $I(t) \in \mathcal{I}_{S(t)}$. The controller then chooses $\mu_{ij}^{(c)}(t)$ variables subject to the following constraints:

$$\mu_{ij}^{(c)}(t) \geq 0 \quad \forall i, j, c \in \{1, \ldots, N\} \qquad (57)$$

$$\mu_{ii}^{(c)}(t) = \mu_{ij}^{(i)}(t) = 0 \quad \forall i, j, c \in \{1, \ldots, N\} \qquad (58)$$

$$\sum_{c=1}^{N} \mu_{ij}^{(c)}(t) \leq C_{ij}(I(t), S(t)) \quad \forall i, j \in \{1, \ldots, N\} \qquad (59)$$

B. The Utility Optimization Problem 

This problem fits the general model of Section II by defining the random event as follows:

$$\omega(t) \triangleq [A(t); S(t)]$$

That is, the random event $\omega(t)$ is the collection of all new arrivals together with the current topology state. The control action $\alpha(t)$ is defined by:

$$\alpha(t) \triangleq [I(t); (\mu_{ij}^{(c)}(t)); (x_m(t))]$$

representing the resource allocation, transmission, and flow control decisions. The action space $A_{\omega(t)}$ is defined by the set of all $I(t) \in \mathcal{I}_{S(t)}$, all $(\mu_{ij}^{(c)}(t))$ that satisfy (57)-(59), and all $(x_m(t))$ that satisfy (55).

Define $\overline{x}_m$ as the time average of $x_m(t)$ over the first $t_{end}$ slots (as in Section II), and define $\overline{x}$ as the vector of these time averages. Our objective is to solve the following problem:

Maximize: $\phi(\overline{x})$ (60)
Subject to: $\alpha(t) \in A_{\omega(t)} \quad \forall t \in \{0, \ldots, t_{end} - 1\}$ (61)
$\sum_{i=1}^{N} \overline{\mu}_{in}^{(c)} + \sum_{m \in \mathcal{M}_n^{(c)}} \overline{x}_m \leq \sum_{j=1}^{N} \overline{\mu}_{nj}^{(c)} \quad \forall n, c \in \{1, \ldots, N\}$ (62)

where $\phi(x)$ is a continuous, concave, and entrywise non-decreasing utility function of the form:

$$\phi(x) = \sum_{m=1}^{M} \phi_m(x_m)$$

Define $\nu_m$ as the right partial derivative of $\phi_m(x)$ at $x = 0$, and assume $0 \leq \nu_m < \infty$ for all m. Thus, this problem fits exactly into the general framework, satisfying Assumption A1 by our boundedness assumptions, and satisfying Assumption A2 by the "idle" control action that admits no new data and transmits over no links.

*It is worth noting now that for networks with orthogonal channels, our "max-weight" transmission algorithm (to be defined in the next subsection) decouples to allow nodes to make transmission decisions based only on those components of the current topology state $S(t)$ that relate to their own local channels. Of course, for wireless interference networks, all channels are coupled, although distributed approximations of max-weight transmission exist in this case [1].

C. The Universal Network Scheduling Algorithm 

To apply the general solution, note that the constraints (62) are upheld by stabilizing the actual queues $Q_n^{(c)}(t)$ with updates (56). Because we have not specified any additional constraints, there are no $Z_l(t)$ queues as used in the general framework. However, we have auxiliary variables $\gamma_m(t)$ for each $m \in \{1, \ldots, M\}$, with virtual queue update:

$$H_m(t+1) = H_m(t) + \gamma_m(t) - x_m(t) \qquad (63)$$

The algorithm is thus:

• (Auxiliary Variables) For each slot t, each session $m \in \{1, \ldots, M\}$ observes the current virtual queue $H_m(t)$, and chooses auxiliary variable $\gamma_m(t)$ as the solution to:

Maximize: $V\phi_m(\gamma_m(t)) - H_m(t)\gamma_m(t)$ (64)
Subject to: $0 \leq \gamma_m(t) \leq A_m^{max}$

This is a maximization of a concave single variable function over an interval, the same as in the internet algorithm of Section IV.

• (Flow Control) For each slot t, each session m observes $A_m(t)$ and the queue values $H_m(t)$, $Q_{n_m}^{(c_m)}(t)$ (where $n_m$ denotes the source node of session m, and $c_m$ represents its destination). Note that these queues are all local to the source node of the session, and hence can be observed easily. It then chooses $x_m(t)$ to solve:

Maximize: $H_m(t)x_m(t) - Q_{n_m}^{(c_m)}(t)x_m(t)$ (65)
Subject to: $0 \leq x_m(t) \leq A_m(t)$

This reduces to the "bang-bang" flow control decision of choosing $x_m(t) = A_m(t)$ if $Q_{n_m}^{(c_m)}(t) \leq H_m(t)$, and $x_m(t) = 0$ otherwise.

• (Resource Allocation and Transmission) For each slot t, the network controller observes queue backlogs $\{Q_n^{(c)}(t)\}$ and the topology state $S(t)$ and chooses $I(t) \in \mathcal{I}_{S(t)}$ and $\{\mu_{ij}^{(c)}(t)\}$ to solve:

Max: $\sum_{n,c} Q_n^{(c)}(t)\left[\sum_{j=1}^{N} \mu_{nj}^{(c)}(t) - \sum_{i=1}^{N} \mu_{in}^{(c)}(t)\right]$ (66)
S.t.: $I(t) \in \mathcal{I}_{S(t)}$ and (57)-(59)

• (Queue Updates) Update the virtual queues $H_m(t)$ according to (63) and the actual queues $Q_n^{(c)}(t)$ according to (56).
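
The flow control step is simple enough to state in two lines. The sketch below implements the bang-bang rule that solves (65) together with the virtual queue update (63); the function and argument names are illustrative.

```python
def flow_control(H_m, Q_src, A_m):
    """Bang-bang solution of (65): admit all arrivals iff Q_src <= H_m.

    Q_src stands for the source-node backlog of the session's commodity.
    """
    return A_m if Q_src <= H_m else 0.0


def update_H(H_m, gamma_m, x_m):
    """Virtual queue update (63)."""
    return H_m + gamma_m - x_m
```

The rule follows because the objective in (65) is linear in $x_m(t)$ with slope $H_m(t) - Q_{n_m}^{(c_m)}(t)$, so the maximum over the interval is attained at an endpoint.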
The exact decisions required to implement the resource allocation and transmission component are described in Subsection V-D below. Before covering this, we state the performance of the algorithm under a general C-approximate implementation of the above algorithm. For simplicity, assume all queues are initially zero. By Theorem 1c we have for any integers $R > 0$, $T > 0$:

$$\phi(\overline{x}) \geq \frac{1}{R}\sum_{r=0}^{R-1} \Phi_r^* - \frac{B + C}{V} - \frac{D(T-1)}{V} - \sum_{m=1}^{M} \frac{\nu_m|H_m(RT)|}{RT} \qquad (67)$$

While Theorem 1 also provides a bound on the final term, and bounds on all queue sizes, we can again provide tighter constant queue bounds by taking advantage of the flow control structure of the problem. Indeed, by the same argument that proves (53) in Section IV, we have that $H_m(t)$ cannot decrease if it is already negative, and cannot increase if it is beyond $V\nu_m$, so that for all $m \in \{1, \ldots, M\}$ we have:

$$-A_m^{max} \leq H_m(t) \leq V\nu_m + A_m^{max} \qquad (68)$$

provided that these bounds are true for $H_m(0)$ (which is indeed the case if $H_m(0) = 0$). Therefore, $|H_m(t)| \leq V\nu_m + A_m^{max}$, and the utility bound (67) becomes:

$$\phi(\overline{x}) \geq \frac{1}{R}\sum_{r=0}^{R-1} \Phi_r^* - \frac{B + C}{V} - \frac{D(T-1)}{V} - \sum_{m=1}^{M} \frac{\nu_m(V\nu_m + A_m^{max})}{RT} \qquad (69)$$

The values of B and D in (69) for this context are given in Appendix E. Note that this yields a "utility fudge factor" of the form as indicated in the introduction of this paper:

$$\text{utility fudge factor} = \frac{B_1 T}{V} + \frac{B_2 V}{RT}$$

where:

$$B_1 \triangleq (B + C - D)/T + D, \qquad B_2 \triangleq \sum_{m=1}^{M} \nu_m(\nu_m + A_m^{max}/V)$$

The value of C used in the above bound is equal to 0 if we use a 0-approximation, being an exact implementation of the above algorithm. In the next subsections we purposefully engineer a C-approximation for a nonzero but constant C so that we can additionally provide deterministic bounds on all actual queues $Q_n^{(c)}(t)$.

D. Resource Allocation and Transmission 



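By switching the sums in (66), it is easy to show that the resource allocation and transmission maximization reduces to the following generalized "max-weight" algorithm (see [1]): Every slot t, choose $I(t) \in \mathcal{I}_{S(t)}$ to maximize:

$$\sum_{i=1}^{N}\sum_{j=1}^{N} C_{ij}(I(t), S(t))W_{ij}(t)$$

where $W_{ij}(t)$ are weights defined by:

$$W_{ij}(t) \triangleq \max_{c \in \{1, \ldots, N\}} \max[W_{ij}^{(c)}(t), 0]$$

where $W_{ij}^{(c)}(t)$ are differential backlogs:

$$W_{ij}^{(c)}(t) \triangleq Q_i^{(c)}(t) - Q_j^{(c)}(t)$$

The transmission decision variables are then given by:

$$\mu_{ij}^{(c)}(t) = \begin{cases} C_{ij}(I(t), S(t)) & \text{if } c = c_{ij}^*(t) \text{ and } W_{ij}^{(c)}(t) \geq 0 \\ 0 & \text{otherwise} \end{cases}$$

where $c_{ij}^*(t)$ is defined as the commodity $c \in \{1, \ldots, N\}$ that maximizes the differential backlog $W_{ij}^{(c)}(t)$ (breaking ties arbitrarily).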
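
The per-link part of this max-weight rule can be sketched as follows. The layout is assumed for illustration: nodes and commodities are indexed from 0 to N-1, `Q[n][c]` holds the commodity-c backlog at node n (with `Q[n][n] == 0`), and commodity i is excluded on links leaving node i so that constraint (58) is respected. Ties are broken in favor of the lowest index, which is one arbitrary choice among many.

```python
def max_weight_link(Q, i, j, N):
    """Return (W_ij, c_star) for link (i, j) from differential backlogs."""
    candidates = [c for c in range(N) if c != i]   # (58): no commodity i on links out of i
    c_star = max(candidates, key=lambda c: Q[i][c] - Q[j][c])
    return max(Q[i][c_star] - Q[j][c_star], 0), c_star


def transmission_rates(Q, rate, i, j, N):
    """Offer the whole link rate to c_star if its differential backlog is >= 0."""
    _, c_star = max_weight_link(Q, i, j, N)
    mu = [0.0] * N
    if Q[i][c_star] - Q[j][c_star] >= 0:
        mu[c_star] = rate
    return mu
```

The weights $W_{ij}(t)$ returned here are what the resource allocation decision $I(t)$ would then use when maximizing the weighted sum of link rates; that outer maximization depends on the structure of the set of feasible decisions and is not sketched.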



E. A C -Approximate Transmission Algorithm 

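Rather than implement the exact transmission algorithm
in the above subsection, we present here a useful $C$-approximation
that yields bounded queues (see also [9][32]).
Define $\hat{W}_{ij}^{(c)}(t)$ as follows:

$$\hat{W}_{ij}^{(c)}(t) \triangleq \begin{cases} Q_i^{(c)}(t) - Q_j^{(c)}(t) + \theta_i^{(c)} - \theta_j^{(c)} & \text{if } Q_j^{(c)}(t) \leq Q^{\max} - \beta_j \\ -1 & \text{otherwise} \end{cases} \tag{70}$$

where for each $n \in \{1,\ldots,N\}$, $\beta_n$ is defined as the largest
amount of any commodity that can enter node $n$, considering
both exogenous and endogenous arrivals (this is finite by
the boundedness assumptions on transmission rates and new
arrivals), and where $Q^{\max}$ is defined:

$$Q^{\max} \triangleq V\nu^{\max} + A^{\max} + \beta^{\max} \tag{71}$$

where $\nu^{\max}$, $A^{\max}$, $\beta^{\max}$ are given by:

$$\nu^{\max} \triangleq \max_{m \in \{1,\ldots,M\}} \nu_m, \qquad A^{\max} \triangleq \max_{m \in \{1,\ldots,M\}} A_m^{\max}, \qquad \beta^{\max} \triangleq \max_{n \in \{1,\ldots,N\}} \beta_n$$

Finally, the values $\theta_i^{(c)}$ are any non-negative weights that
represent some type of estimate of the distance from node $i$ to
destination $c$ (possibly being zero if there is no such estimate
available). Such weights are known to experimentally improve
delay by biasing routing decisions to move in directions closer
to the destination (see [6][1][9]). Then define $\hat{W}_{ij}(t)$ as:

$$\hat{W}_{ij}(t) \triangleq \max_{c \in \{1,\ldots,N\}} \max[\hat{W}_{ij}^{(c)}(t), 0]$$

and choose $I(t) \in \mathcal{I}_{S(t)}$ to maximize:

$$\sum_{i=1}^{N} \sum_{j=1}^{N} C_{ij}(I(t), S(t))\, \hat{W}_{ij}(t) \tag{72}$$

and choose transmission variables:

$$\mu_{ij}^{(c)}(t) = \begin{cases} C_{ij}(I(t), S(t)) & \text{if } c = c_{ij}^{*}(t) \text{ and } \hat{W}_{ij}^{(c)}(t) \geq 0 \\ 0 & \text{otherwise} \end{cases} \tag{73}$$

where $c_{ij}^{*}(t)$ is the commodity $c \in \{1,\ldots,N\}$ that maximizes $\hat{W}_{ij}^{(c)}(t)$.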
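
The threshold rule behind the modified weights can be sketched as follows. The sketch assumes the reading that is consistent with the proofs of Lemmas 3 and 4: the weight is the biased differential backlog whenever the receiving queue is at most $Q^{\max} - \beta_j$, and is set to $-1$ otherwise, so that a nearly full queue never receives more data. The function and argument names are hypothetical.

```python
def q_max(V: float, nu_max: float, A_max: float, beta_max: float) -> float:
    """Queue cap Q^max = V * nu^max + A^max + beta^max."""
    return V * nu_max + A_max + beta_max

def modified_weight(Q_i: float, Q_j: float, theta_i: float, theta_j: float,
                    Q_cap: float, beta_j: float) -> float:
    """Modified weight of one commodity on link (i, j)."""
    if Q_j <= Q_cap - beta_j:
        # Receiver has room for a full slot of arrivals: use the
        # differential backlog plus the distance bias.
        return (Q_i - Q_j) + (theta_i - theta_j)
    # Receiver is within beta_j of the cap: block transmissions to it.
    return -1.0
```

Because a link is served only when its weight is non-negative, returning $-1$ is enough to stop all traffic into a node whose queue is close to the cap.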






F. Bounded Queues 

Lemma 3: (Bounded $Q_n^{(c)}(t)$) Suppose auxiliary variables
and flow control decisions are made according to (64) and (65),
with update equations (63) and (56). Suppose that $I(t) \in \mathcal{I}_{S(t)}$
is chosen in some arbitrary manner every slot $t$ (not necessarily
according to (72)), but that transmission decisions are made
according to (73) with respect to the particular $I(t)$ chosen.
Then for all $t$ we have:

$$Q_n^{(c)}(t) \leq Q^{\max} = V\nu^{\max} + A^{\max} + \beta^{\max} \tag{74}$$

provided that this inequality holds at $t = 0$.

Proof: Suppose that $Q_n^{(c)}(t) \leq Q^{\max}$ for all $n, c$ for a
particular slot $t$ (this is true by assumption on slot $t = 0$).
We prove it also holds for slot $t + 1$. First suppose that
$Q_n^{(c)}(t) \leq Q^{\max} - \beta_n$. Then, because $\beta_n$ is the largest amount
of new arrivals to queue $Q_n^{(c)}(t)$ over one slot (considering
both endogenous and exogenous arrivals), it must be that
$Q_n^{(c)}(t + 1) \leq Q^{\max}$.

Consider now the opposite case when $Q^{\max} - \beta_n < Q_n^{(c)}(t) \leq Q^{\max}$.
Then from (70) we see that $\hat{W}_{in}^{(c)}(t) = -1$
for all links $(i, n)$ over which new commodity $c$ data could be
transmitted to node $n$ from other nodes. Thus, by (73) we see
that no commodity $c$ data will be transmitted to node $n$ from
any other node. Further, we have:

$$Q_n^{(c)}(t) > Q^{\max} - \beta_n = V\nu^{\max} + A^{\max} + \beta^{\max} - \beta_n \geq V\nu_m + A_m^{\max} \geq H_m(t)$$

for all $m \in \{1,\ldots,M\}$, where the first equality follows by
definition of $Q^{\max}$ in (71) and the final inequality follows
by (68). It follows by the flow control decision (65) that
$a_m(t) = 0$ for all sessions $m$ that might deliver new data
to queue $Q_n^{(c)}(t)$. Thus, no new commodity $c$ data (exogenous
or endogenous) arrives to node $n$ on slot $t$, and
$Q_n^{(c)}(t + 1) \leq Q_n^{(c)}(t) \leq Q^{\max}$. □

The queue bound (74) in the above lemma provides the
strong deterministic guarantee that all queues are bounded by
a constant that grows linearly with the $V$ parameter. Thus,
while increasing $V$ can improve the terms $(B + C)/V$ and
$D(T - 1)/V$ in the utility guarantee (69), the tradeoff is
linear growth with $V$ in the queue congestion bound (74), as well as an
increase in the number of frames $R$ required for the final term
in the utility bound (69) to decay to near-zero.

While it is intuitive that the above algorithm produces a
$C$-approximation for some constant value $C$, we complete
the analysis below by formally showing this. Additionally,
we note that a $\theta$-multiplicative-approximate solution to (72)
leads to utility guarantees where $\phi^*$ is re-defined as a $T$-slot
lookahead utility on a network where the $C(I(t), S(t))$
function is replaced by $\theta C(I(t), S(t))$, which holds for the
same reason as described in Section IV-B.

G. Computing the C value 

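Lemma 4: Using the modified weights $\hat{W}_{ij}^{(c)}(t)$ in (70)
results in a $C$-approximation of the max-weight resource
allocation and transmission scheduling problem (66), with:

$$C \triangleq 2 C_{sum}[\beta^{\max} + \theta_{diff}] \tag{75}$$

where $C_{sum}$ is the largest possible sum of transmission rates
$\sum_{ij} C_{ij}(I(t), S(t))$, summed over all links and considering
all possible $S(t)$ states and $I(t)$ decisions (being finite by
the boundedness assumptions on all links), and $\theta_{diff}$ is the
maximum difference in $\theta_i^{(c)}$ and $\theta_j^{(c)}$, maximized over all node
pairs and all commodities $c$.

Proof: Because all queues $Q_n^{(c)}(t)$ are upper bounded by
$Q^{\max}$, if $Q_j^{(c)}(t) > Q^{\max} - \beta_j$, then
$\max[W_{ij}^{(c)}(t), 0] = \max[Q_i^{(c)}(t) - Q_j^{(c)}(t), 0] \leq \beta_j$. It follows that:

$$\left|\max[W_{ij}^{(c)}(t), 0] - \max[\hat{W}_{ij}^{(c)}(t), 0]\right| \leq \beta^{\max} + \theta_{diff}$$

It follows that:

$$\left|W_{ij}(t) - \hat{W}_{ij}(t)\right| \leq \beta^{\max} + \theta_{diff}$$

Therefore:

$$\left|\sum_{i=1}^{N}\sum_{j=1}^{N} C_{ij}(I(t), S(t))[W_{ij}(t) - \hat{W}_{ij}(t)]\right| \leq \sum_{i=1}^{N}\sum_{j=1}^{N} C_{ij}(I(t), S(t))[\beta^{\max} + \theta_{diff}] \leq C/2$$

where we have used the fact that $C_{sum}$ is the maximum sum rate over all links
on any slot. Now let $I^*(t)$ be the maximizer of
$\sum_{ij} C_{ij}(I(t), S(t)) W_{ij}(t)$ over $\mathcal{I}_{S(t)}$, and let $\hat{I}(t)$ be the
maximizer of $\sum_{ij} C_{ij}(I(t), S(t)) \hat{W}_{ij}(t)$. Then:

$$\begin{aligned}
\sum_{ij} C_{ij}(\hat{I}(t), S(t)) W_{ij}(t) &\geq \sum_{ij} C_{ij}(\hat{I}(t), S(t)) \hat{W}_{ij}(t) - C/2 \\
&\geq \sum_{ij} C_{ij}(I^*(t), S(t)) \hat{W}_{ij}(t) - C/2 \\
&\geq \sum_{ij} C_{ij}(I^*(t), S(t)) W_{ij}(t) - C
\end{aligned}$$

It follows that the resource allocation $\hat{I}(t)$ (and the corresponding
transmission decisions given by (73)) yields a $C$-approximation. □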
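
The key step of the proof, that the exact and modified weights differ by at most $\beta^{\max} + \theta_{diff}$ when all queues stay below $Q^{\max}$, can be checked numerically. The sketch below is a randomized sanity check, not a proof; it assumes the threshold form of the modified weight (biased differential backlog below the threshold, $-1$ above it), and all names and parameter values are hypothetical.

```python
import random

def approx_constant(c_sum: float, beta_max: float, theta_diff: float) -> float:
    """C = 2 * C_sum * (beta^max + theta_diff), as in (75)."""
    return 2.0 * c_sum * (beta_max + theta_diff)

def weight_gap(Q_i, Q_j, th_i, th_j, Q_cap, beta_j):
    """|max[W,0] - max[W_hat,0]| for one commodity on one link."""
    exact = max(Q_i - Q_j, 0.0)
    if Q_j <= Q_cap - beta_j:
        modified = Q_i - Q_j + th_i - th_j
    else:
        modified = -1.0
    return abs(exact - max(modified, 0.0))

def worst_gap(trials=10000, Q_cap=20.0, beta=3.0, theta_max=2.0, seed=0):
    """Largest gap seen over random backlogs in [0, Q_cap]."""
    rng = random.Random(seed)
    worst = 0.0
    for _ in range(trials):
        Q_i, Q_j = rng.uniform(0, Q_cap), rng.uniform(0, Q_cap)
        th_i, th_j = rng.uniform(0, theta_max), rng.uniform(0, theta_max)
        worst = max(worst, weight_gap(Q_i, Q_j, th_i, th_j, Q_cap, beta))
    return worst
```

With the default parameters, `worst_gap()` stays below `beta + theta_max`, which is the per-link bound that is then multiplied by $C_{sum}$ and doubled to obtain $C$.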

VI. Approximate Scheduling and Slater 
Conditions 

Here we replace Assumption A2 with a stronger assumption
that states the constraints can be satisfied with $\delta$ slackness.
This is related to a Slater condition in classical static optimization
problems [33]. It allows all queues to be deterministically
bounded. It also allows performance analysis for implementations
when the error in the attempted minimization of the right
hand side of (38) is off by more than just a constant $C$, such
as an amount that may be proportional to the queue backlog
(similar to the $\theta$-multiplicative-approximations discussed in
Section IV-B).

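Assumption A3: There exists a value $\delta > 0$ such that for
all $\omega \in \{\omega(0), \ldots, \omega(t_{end} - 1)\}$, there is at least one control
action $\alpha'_{\omega} \in \mathcal{A}_{\omega}$ that satisfies:

$$\hat{y}_l(\alpha'_{\omega}, \omega) + g_l(\hat{x}(\alpha'_{\omega}, \omega)) \leq -\delta \quad \forall l \in \{1,\ldots,L\}$$

$$\hat{a}_k(\alpha'_{\omega}, \omega) \leq \hat{b}_k(\alpha'_{\omega}, \omega) - \delta \quad \forall k \in \{1,\ldots,K\}$$

$$\hat{x}(\alpha'_{\omega}, \omega) + \epsilon \in \mathcal{X}$$

for all vectors $\epsilon = (\epsilon_1, \ldots, \epsilon_M)$ with entries $\epsilon_m$ that satisfy
$|\epsilon_m| \leq \delta$ for all $m \in \{1,\ldots,M\}$. Further, assume that:

$$x_m^{\min} + \delta \leq \hat{x}_m(\alpha'_{\omega}, \omega) \leq x_m^{\max} - \delta \quad \forall m \in \{1,\ldots,M\}$$

This final assumption is mild and can easily be engineered
to be true by convexly extending the range of the convex
functions $f(\cdot)$, $g_l(\cdot)$ by $\delta$ in all directions (so that $x_m^{\min}$ is
decreased by $\delta$ and $x_m^{\max}$ is increased by $\delta$), as in [34].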

Define a $C(t)$-approximation as an algorithm that, every slot
$t$, observes the current queue states and chooses a control action
that comes within $C(t)$ of minimizing the right hand side of
(38), where $C(t)$ is a value that can depend on $t$. Suppose that
we implement the universal scheduling algorithm of Section
III-B using a $C(t)$-approximation with $C(t)$ that satisfies the
following for all $t$:

$$C(t) \leq C + V\epsilon_V + \sum_{l=1}^{L} Z_l(t)\epsilon_Z + \sum_{k=1}^{K} Q_k(t)\epsilon_Q + \sum_{m=1}^{M} |H_m(t)|\epsilon_H \tag{76}$$

where $C$, $\epsilon_V$, $\epsilon_Z$, $\epsilon_Q$, $\epsilon_H$ are non-negative constants. Note that
this is a $C$-approximation if $\epsilon_V = \epsilon_Z = \epsilon_Q = \epsilon_H = 0$, and is
the exact minimization of (38) if we additionally have $C = 0$.
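
To make the error budget in (76) concrete, the following sketch evaluates the largest per-slot error that a $C(t)$-approximation is allowed for a given queue state: a constant plus terms proportional to $V$ and to the current backlogs. The function name and argument layout are hypothetical.

```python
def allowed_error(C, V, Z, Q, H, eps_V, eps_Z, eps_Q, eps_H):
    """Right hand side of (76) for queue vectors Z, Q and virtual queues H."""
    return (C
            + V * eps_V
            + eps_Z * sum(Z)
            + eps_Q * sum(Q)
            + eps_H * sum(abs(h) for h in H))
```

Setting all four epsilon values to zero recovers a plain $C$-approximation, and additionally setting `C = 0` corresponds to exact minimization.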

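Theorem 2: Suppose Assumptions A1 and A3 hold for
some $\delta > 0$. Consider any $C(t)$-approximate algorithm that
satisfies (76) every slot $t$, and assume that:

$$\epsilon_Q < \delta, \qquad \epsilon_H < \delta$$

$$\epsilon_Z + \epsilon_H \sum_{m=1}^{M} \beta_{l,m} < \delta \quad \forall l \in \{1,\ldots,L\}$$

Let the random event sequence $\{\omega(0), \omega(1), \ldots\}$ be arbitrary.
Suppose all initial queue backlogs are zero. Then:

(a) All queue backlogs are bounded, so that for any slot
$t \geq 0$ we have:

$$Q_k(t),\ Z_l(t),\ |H_m(t)| \leq \frac{V C_3}{\theta}$$

where

$$C_3 \triangleq \sqrt{D_1 + D_2 + D_3}$$

where $D_1$, $D_2$, $D_3$ are constants defined as:

$$D_1 \triangleq \left[\frac{B + C}{V} + (y_0^{\max} - y_0^{\min}) + (f^{\max} - f^{\min}) + \epsilon_V\right]^2$$

$$D_2 \triangleq 2D\theta^2/V^2$$

$$D_3 \triangleq \frac{2 z_{\max}\theta}{V}\left[\frac{B + C}{V} + (y_0^{\max} - y_0^{\min}) + (f^{\max} - f^{\min}) + \epsilon_V\right]$$

where $D$ is defined in (41), $z_{\max}$ is the maximum over all
$z_l^{diff}$, $q_k^{diff}$, and $h_m^{diff}$ constants, and where $\theta$ is defined:

$$\theta \triangleq \min\left[\delta - \epsilon_Q,\ \delta - \epsilon_H,\ \min_{l \in \{1,\ldots,L\}} \frac{\delta - \epsilon_Z - \epsilon_H\sum_{m=1}^{M}\beta_{l,m}}{1 + \sum_{m=1}^{M}\beta_{l,m}}\right] \tag{77}$$

In the special case when $\epsilon_Z = \epsilon_Q = \epsilon_H = 0$, we have
$\theta = \delta/(1 + \beta_{sum})$, where $\beta_{sum}$ is defined:

$$\beta_{sum} \triangleq \max_{l \in \{1,\ldots,L\}} \sum_{m=1}^{M} \beta_{l,m}$$

(b) For any designated time $t_{end} > 0$ we have:

$$\overline{y}_l + g_l(\overline{x}) \leq \frac{V C_3}{\theta\, t_{end}}\left[1 + \sum_{m=1}^{M}\beta_{l,m}\right] \quad \forall l \in \{1,\ldots,L\}$$

$$\overline{a}_k \leq \overline{b}_k + \frac{V C_3}{\theta\, t_{end}} \quad \forall k \in \{1,\ldots,K\}$$

$$\overline{x} + \epsilon(t) \in \mathcal{X}$$

where $\epsilon(t) = (\epsilon_1(t), \ldots, \epsilon_M(t))$ has entries that satisfy:

$$|\epsilon_m(t)| \leq \frac{V C_3}{\theta\, t_{end}}$$

(c) Consider any positive integer frame size $T$, any positive
integer $R$, and define $t_{end} = RT$. Then the value of the system
cost metric over $t_{end}$ slots satisfies:

$$\overline{y}_0 + f(\overline{x}) \leq \frac{1 - \rho}{R}\sum_{r=0}^{R-1} F_r^* + \epsilon_V + \rho(y_0^{\max} + f^{\max}) + \frac{B + C + \tilde{D}(T - 1)}{V} + \sum_{m=1}^{M} \frac{\nu_m V C_3}{\theta RT} \tag{78}$$

where $\rho$ is defined:

$$\rho \triangleq \max\left[\frac{\epsilon_Q}{\delta},\ \frac{\epsilon_H}{\epsilon_H + \theta},\ \frac{\epsilon_Z}{\delta - (\epsilon_H + \theta)\beta_{sum}}\right]$$

and where $\tilde{D}$ is a constant defined in (89), $B$ is a constant
defined in (34), and $C_3$ is a constant defined in part (a).
Proof: See Appendix D. □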
The cost bound (78) can be understood as follows: The last
term on the right hand side goes to zero as $R$ increases (and
is equal to 0 for all $R$ if $f(\cdot) = 0$ so that $\nu_m = 0$ for all
$m$). The second to last term can be made arbitrarily small
with a suitably large $V$. Finally, if $\epsilon_Z$, $\epsilon_Q$, $\epsilon_H$, $\epsilon_V$ are small,
then $\rho$ is small and the remaining terms on the right hand side
are close to the cost $\frac{1}{R}\sum_{r=0}^{R-1} F_r^*$ which is associated with
implementing the $T$-slot lookahead policy over $R$ frames.
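
The constants $\theta$ and $\rho$ of Theorem 2 can be computed directly from the slackness and error parameters. The sketch below assumes that $\theta$ is the largest value satisfying $\theta \leq \delta - \epsilon_Q$, $\theta \leq \delta - \epsilon_H$, and $\theta \leq \delta - \epsilon_Z - (\epsilon_H + \theta)\sum_m \beta_{l,m}$ for every $l$, and that $\rho$ is the smallest probability that makes every queue coefficient in the proof of part (c) non-positive. These closed forms are our reading of the theorem; the function names and the nested-list layout of `beta` are hypothetical.

```python
def theta_value(delta, eps_Z, eps_Q, eps_H, beta):
    """beta[l][m] holds beta_{l,m}; returns the largest feasible theta."""
    candidates = [delta - eps_Q, delta - eps_H]
    for row in beta:
        s = sum(row)
        # Solve theta <= delta - eps_Z - (eps_H + theta) * s for theta.
        candidates.append((delta - eps_Z - eps_H * s) / (1.0 + s))
    return min(candidates)

def rho_value(delta, eps_Z, eps_Q, eps_H, beta, theta):
    """Smallest mixing probability that cancels the queue error terms."""
    beta_sum = max(sum(row) for row in beta)
    return max(eps_Q / delta,
               eps_H / (eps_H + theta),
               eps_Z / (delta - (eps_H + theta) * beta_sum))
```

With all error constants equal to zero the first function returns $\delta/(1 + \beta_{sum})$ and the second returns 0, so the cost bound reduces to the one for a plain $C$-approximation.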

VII. Conclusions 

We have developed a framework for universal constrained 
optimization of time averages in time varying systems. Our 
results hold for any event sample paths and do not require a 
probability model. It was shown that performance can closely 
track the performance of an ideal policy with knowledge of the 
future up to T slots, provided that we allow the number of T- 
slot frames, denoted by R, to be large enough to ensure that the 
error terms decay to a negligible value. This framework was 
applied to an internet model and to a more extensive queueing 
network model to provide utility guarantees with deterministic 
queue bounds for arbitrary traffic, channels, and mobility. 
Appendix A — Proof of Theorem 1

We first prove parts (a) and (b) of Theorem 1.

Proof: (Theorem 1 part (a)) Let $\gamma(t)$ and $\alpha(t)$ represent
the decisions made by the $C$-approximate policy on slot $t$,
which necessarily satisfy constraints (21)-(23). Because these
decisions come within $C$ of minimizing the right hand side of
(38) over all other possible decisions, we have from (38):

$$\begin{aligned}
\Delta_1(t) &+ V\hat{y}_0(\alpha(t), \omega(t)) + Vf(\gamma(t)) \leq B + C + V\hat{y}_0(\alpha^*(t), \omega(t)) + Vf(\gamma^*(t)) \\
&+ \sum_{l=1}^{L} Z_l(t)\left[\hat{y}_l(\alpha^*(t), \omega(t)) + g_l(\gamma^*(t))\right] \\
&+ \sum_{k=1}^{K} Q_k(t)\left[\hat{a}_k(\alpha^*(t), \omega(t)) - \hat{b}_k(\alpha^*(t), \omega(t))\right] \\
&+ \sum_{m=1}^{M} H_m(t)\left[\gamma_m^*(t) - \hat{x}_m(\alpha^*(t), \omega(t))\right]
\end{aligned} \tag{79}$$

where $\alpha^*(t)$ and $\gamma^*(t)$ represent any alternative decisions that
could be made on slot $t$ that satisfy (21)-(23).

Now choose $\alpha^*(t) = \alpha'_{\omega(t)}$, where $\alpha'_{\omega(t)} \in \mathcal{A}_{\omega(t)}$ is the
decision known to exist by Assumption A2 that satisfies:

$$\hat{y}_l(\alpha^*(t), \omega(t)) + g_l(\hat{x}(\alpha^*(t), \omega(t))) \leq 0 \quad \forall l \in \{1,\ldots,L\}$$

$$\hat{a}_k(\alpha^*(t), \omega(t)) \leq \hat{b}_k(\alpha^*(t), \omega(t)) \quad \forall k \in \{1,\ldots,K\}$$

$$\hat{x}(\alpha^*(t), \omega(t)) \in \mathcal{X}$$

Further, choose $\gamma^*(t) = \hat{x}(\alpha^*(t), \omega(t))$. These decisions
satisfy (21)-(23), and plugging these decisions directly into
the right hand side of (79) yields:

$$\Delta_1(t) + V\hat{y}_0(\alpha(t), \omega(t)) + Vf(\gamma(t)) \leq B + C + V\hat{y}_0(\alpha^*(t), \omega(t)) + Vf(\gamma^*(t))$$

Rearranging terms and using the bounds $y_0^{\min}$, $y_0^{\max}$ and
$f^{\min}$, $f^{\max}$ yields:

$$\Delta_1(t) \leq B + C + V(y_0^{\max} - y_0^{\min}) + V(f^{\max} - f^{\min})$$

Let the right hand side of the above inequality be denoted by
$P$. Using the definition of $\Delta_1(t)$ thus gives:

$$L(\Theta(t + 1)) - L(\Theta(t)) \leq P$$

The above holds for all $t \geq 0$. Summing over $\tau \in \{0, \ldots, t - 1\}$
(for some time $t > 0$) and dividing by $t$ yields:

$$\frac{1}{t}\left[L(\Theta(t)) - L(\Theta(0))\right] \leq P$$

Using the definition of $L(\Theta(t))$ in (32) proves part (a). □

Proof: (Theorem 1 part (b)) From part (a), if all queues are
initially empty (so that $L(\Theta(0)) = 0$), we have for all slots
$t \geq 0$:

$$Z_l(t),\ Q_k(t),\ |H_m(t)| \leq C_0\sqrt{tV} \tag{80}$$

Plugging (80) into (26), (27), (28) of Lemma 1 proves part
(b). □

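To prove part (c) of Theorem 1 we need the following
preliminary lemma.

Lemma 5: For any initial time $t_0$, any queue values $\Theta(t_0)$,
any integer $T \geq 1$, and any collection of $C$-approximate
decisions that are implemented over the $T$-slot interval
$\tau \in \{t_0, \ldots, t_0 + T - 1\}$, we have:

$$\begin{aligned}
\Delta_T(t_0) &+ \sum_{\tau=t_0}^{t_0+T-1}\left[V\hat{y}_0(\alpha(\tau), \omega(\tau)) + Vf(\gamma(\tau))\right] \leq BT + CT + DT(T - 1) \\
&+ \sum_{\tau=t_0}^{t_0+T-1}\left[V\hat{y}_0(\alpha^*(\tau), \omega(\tau)) + Vf(\gamma^*(\tau))\right] \\
&+ \sum_{l=1}^{L} Z_l(t_0)\sum_{\tau=t_0}^{t_0+T-1}\left[\hat{y}_l(\alpha^*(\tau), \omega(\tau)) + g_l(\gamma^*(\tau))\right] \\
&+ \sum_{k=1}^{K} Q_k(t_0)\sum_{\tau=t_0}^{t_0+T-1}\left[\hat{a}_k(\alpha^*(\tau), \omega(\tau)) - \hat{b}_k(\alpha^*(\tau), \omega(\tau))\right] \\
&+ \sum_{m=1}^{M} H_m(t_0)\sum_{\tau=t_0}^{t_0+T-1}\left[\gamma_m^*(\tau) - \hat{x}_m(\alpha^*(\tau), \omega(\tau))\right]
\end{aligned}$$

for any alternative decisions $\alpha^*(\tau)$, $\gamma^*(\tau)$ over
$\tau \in \{t_0, \ldots, t_0 + T - 1\}$ that satisfy (21)-(23). The constant $B$
is defined according to (34) and $D$ is defined by (41).

Proof: (Lemma 5) Because our policy is $C$-approximate,
for all slots $\tau$ it comes within $C$ of minimizing the right hand
side of (38). Hence, for all $\tau \in \{t_0, \ldots, t_0 + T - 1\}$ we have:

$$\begin{aligned}
\Delta_1(\tau) &+ V\hat{y}_0(\alpha(\tau), \omega(\tau)) + Vf(\gamma(\tau)) \leq B + C + V\hat{y}_0(\alpha^*(\tau), \omega(\tau)) + Vf(\gamma^*(\tau)) \\
&+ \sum_{l=1}^{L} Z_l(\tau)\left[\hat{y}_l(\alpha^*(\tau), \omega(\tau)) + g_l(\gamma^*(\tau))\right] \\
&+ \sum_{k=1}^{K} Q_k(\tau)\left[\hat{a}_k(\alpha^*(\tau), \omega(\tau)) - \hat{b}_k(\alpha^*(\tau), \omega(\tau))\right] \\
&+ \sum_{m=1}^{M} H_m(\tau)\left[\gamma_m^*(\tau) - \hat{x}_m(\alpha^*(\tau), \omega(\tau))\right]
\end{aligned} \tag{81}$$

However, by definition of $z_l^{diff}$, $q_k^{diff}$, $h_m^{diff}$, the queues
$Z_l(t)$, $Q_k(t)$, $H_m(t)$ can change by at most these values on
each slot, and hence for $\tau \in \{t_0, \ldots, t_0 + T - 1\}$ we have:

$$|Z_l(\tau) - Z_l(t_0)| \leq z_l^{diff} \cdot (\tau - t_0)$$

$$|Q_k(\tau) - Q_k(t_0)| \leq q_k^{diff} \cdot (\tau - t_0)$$

$$|H_m(\tau) - H_m(t_0)| \leq h_m^{diff} \cdot (\tau - t_0)$$

Using these in (81) gives:

$$\begin{aligned}
\Delta_1(\tau) &+ V\hat{y}_0(\alpha(\tau), \omega(\tau)) + Vf(\gamma(\tau)) \leq B + C + 2D \cdot (\tau - t_0) + V\hat{y}_0(\alpha^*(\tau), \omega(\tau)) + Vf(\gamma^*(\tau)) \\
&+ \sum_{l=1}^{L} Z_l(t_0)\left[\hat{y}_l(\alpha^*(\tau), \omega(\tau)) + g_l(\gamma^*(\tau))\right] \\
&+ \sum_{k=1}^{K} Q_k(t_0)\left[\hat{a}_k(\alpha^*(\tau), \omega(\tau)) - \hat{b}_k(\alpha^*(\tau), \omega(\tau))\right] \\
&+ \sum_{m=1}^{M} H_m(t_0)\left[\gamma_m^*(\tau) - \hat{x}_m(\alpha^*(\tau), \omega(\tau))\right]
\end{aligned}$$

where $D$ is defined in (41). Summing the above inequality
over $\tau \in \{t_0, \ldots, t_0 + T - 1\}$ yields the result, where we use
the fact that:

$$\sum_{\tau=t_0}^{t_0+T-1} (\tau - t_0) = T(T - 1)/2$$

□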
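
As a worked check of the last step, the per-slot drift term must carry a coefficient of $2D$ for the sum to match the $DT(T-1)$ term in the statement of Lemma 5. This coefficient is inferred from the statement of the lemma and the arithmetic series identity, and is shown here only as a consistency check:

```latex
\sum_{\tau=t_0}^{t_0+T-1} 2D\,(\tau - t_0)
  = 2D \sum_{s=0}^{T-1} s
  = 2D \cdot \frac{T(T-1)}{2}
  = D\,T(T-1)
```

For example, with $T = 4$ the offsets are $0, 1, 2, 3$, which sum to $6 = 4 \cdot 3 / 2$, so the accumulated term is $12D = D \cdot 4 \cdot 3$.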

We can now prove Theorem 1 part (c).

Proof: (Theorem 1 part (c)) Fix integers $r \geq 0$ and $T > 0$.
Fix $\epsilon > 0$, and let $\alpha^*(\tau)$ represent the decisions over the
interval $\tau \in \{rT, \ldots, (r + 1)T - 1\}$ that solve the problem
(15) and yield cost that is no more than $F_r^* + \epsilon$. Let $\gamma^*(\tau)$ be
constant over $\tau \in \{rT, \ldots, (r + 1)T - 1\}$, given by:

$$\gamma^*(\tau) = \frac{1}{T}\sum_{t=rT}^{(r+1)T-1} \hat{x}(\alpha^*(t), \omega(t))$$

Plugging these alternative decisions $\alpha^*(\tau)$ and $\gamma^*(\tau)$ into the
result of Lemma 5 for $t_0 = rT$ yields:

$$\Delta_T(rT) + \sum_{\tau=rT}^{(r+1)T-1}\left[V\hat{y}_0(\alpha(\tau), \omega(\tau)) + Vf(\gamma(\tau))\right] \leq BT + CT + DT(T - 1) + VT(F_r^* + \epsilon)$$

The above holds for all $\epsilon > 0$, and hence we can take a limit
as $\epsilon \to 0$ to remove the $\epsilon$ in the final term. Define $t_{end} = RT$
for some positive integer $R$. Summing the above over
$r \in \{0, \ldots, R - 1\}$ and dividing by $VRT$ yields:

$$\overline{y}_0 + f(\overline{\gamma}) + \frac{L(\Theta(RT)) - L(\Theta(0))}{VRT} \leq \frac{1}{R}\sum_{r=0}^{R-1} F_r^* + \frac{B + C}{V} + \frac{D(T - 1)}{V} \tag{82}$$

where $\overline{y}_0$ and $\overline{\gamma}$ represent time averages over the first $t_{end}$
slots, and where we have used Jensen's inequality in the
convex function $f(\gamma)$.
However, we have by (12):

$$f(\overline{x}) \leq f(\overline{\gamma}) + \sum_{m=1}^{M} \nu_m|\overline{x}_m - \overline{\gamma}_m| = f(\overline{\gamma}) + \sum_{m=1}^{M} \nu_m\frac{|H_m(RT) - H_m(0)|}{RT}$$

where the final equality holds by (29). Using this in (82)
together with the fact that $L(\cdot) \geq 0$ yields the cost bound
(40) of part (c). Finally, if initial queue backlogs are 0, by
(80) applied to time $t = RT$ we have:

$$\frac{|H_m(RT) - H_m(0)|}{RT} = \frac{|H_m(RT)|}{RT} \leq \frac{C_0\sqrt{V}}{\sqrt{RT}}$$

□



Appendix B — Conditions for Achievability of F* 

Consider the following additional assumption. 
Assumption A4: We have either one of the following two
conditions:

1) For all $\omega \in \{\omega(0), \ldots, \omega(t_{end} - 1)\}$, the control action
space $\mathcal{A}_{\omega}$ contains a finite number of actions.

2) For all $\omega \in \{\omega(0), \ldots, \omega(t_{end} - 1)\}$, the set $\mathcal{A}_{\omega}$ is a
compact subset of $\mathbb{R}^c$ for some dimension $c$, the functions
$\hat{y}_l(\alpha, \omega)$, $\hat{a}_k(\alpha, \omega)$ are lower semi-continuous over $\alpha \in \mathcal{A}_{\omega}$,
the functions $\hat{b}_k(\alpha, \omega)$ are upper semi-continuous
over $\alpha \in \mathcal{A}_{\omega}$, and the functions $\hat{x}_m(\alpha, \omega)$ are continuous
over $\alpha \in \mathcal{A}_{\omega}$. Note that all continuous functions are both
upper and lower semi-continuous.

Footnote: A function $b(\alpha)$ is upper semi-continuous over $\alpha \in \mathcal{A}$ if for any
$\alpha \in \mathcal{A}$, we have $b(\alpha) \geq \lim_{n\to\infty} b(\beta_n)$ for all sequences $\{\beta_n\} \in \mathcal{A}$
such that $\lim_{n\to\infty} \beta_n = \alpha$. A function is lower semi-continuous if the
inequality is reversed. All bounded functions that are discontinuous only
on a set of measure zero can be easily modified to have the desired
semi-continuous property by appropriately re-defining the function value at points
of discontinuity. Most systems of practical interest have the desired
semi-continuity properties.

Lemma 6: Suppose Assumptions A1, A2, A4 hold for
given values $\{\omega(0), \ldots, \omega(t_{end} - 1)\}$. Then the infimum
value $F^*$ for the problem (3)-(7) can be achieved by a
particular (possibly non-unique) sequence of control actions
$\{\alpha^*(0), \ldots, \alpha^*(t_{end} - 1)\}$. That is, these actions satisfy the
feasibility constraints (4)-(7), and yield:

$$\overline{y}_0 + f(\overline{x}_1, \ldots, \overline{x}_M) = F^*$$

where:

$$\overline{x}_m \triangleq \frac{1}{t_{end}}\sum_{\tau=0}^{t_{end}-1} \hat{x}_m(\alpha^*(\tau), \omega(\tau)) \quad \forall m \in \{1,\ldots,M\}$$

and where $\overline{y}_0$ is similarly defined as a time average over
$\tau \in \{0, \ldots, t_{end} - 1\}$.

Proof: We already know that Assumption A2 implies the
existence of a feasible sequence of control actions. Thus, the
infimum value $F^*$ of the cost metric over all feasible policies
is well defined, and by Assumption A1 it must satisfy:

$$y_0^{\min} + f^{\min} \leq F^* \leq y_0^{\max} + f^{\max}$$

Consider now the case when the first condition of Assumption
A4 holds. Then there are only a finite number of possible
control sequences over the horizon $\{0, 1, \ldots, t_{end} - 1\}$, and
so there is one that achieves the minimum cost value $F^*$.
The case when the second condition of Assumption A4 holds
can be proven using the Bolzano-Weierstrass Theorem together
with a simple limiting argument, and is omitted for brevity. □

Appendix C - Ergodicity

Consider the infinite horizon problem discussed in Section
III-E. Suppose that the random events $\{\omega(0), \omega(1), \ldots\}$
evolve according to a general ergodic process with a well
defined time average probability distribution. Specifically, let
$\Omega$ represent a finite (but arbitrarily large) outcome space for
$\omega(t)$, and for each $\omega \in \Omega$ assume that there is a steady state
value $\pi(\omega)$, such that:

$$\lim_{t\to\infty} \frac{1}{t}\sum_{\tau=0}^{t-1} 1_{\omega}(\tau) = \pi(\omega) \quad \text{with probability 1}$$

where $1_{\omega}(t)$ is an indicator function that is 1 if $\omega(t) = \omega$, and
zero else. Further, assume the limiting probability converges
uniformly to the steady state value, regardless of past history,
so that:

$$\left|Pr[\omega(t + t_0) = \omega \mid History(t_0)] - \pi(\omega)\right| \leq error(t)$$

where $History(t_0)$ represents the past history of the process
up to slot $t_0$, and where $error(t)$ is a function that decays to
0 as $t \to \infty$, regardless of the past history. This is related to
the decaying memory property in [15] and the admissibility
assumptions in [4][1].

In this case, it can be shown that the optimal infinite horizon
cost, denoted $f^*$, can be achieved over the class of stationary
and randomized algorithms that make (possibly probabilistic)
decisions for control actions on each slot $t$ based only on the
current state $\omega(t)$ (see [4][6][32] for related proofs). Further,
under mild conditions (such as the existence of a value $\delta > 0$
for which the Slater condition of Assumption A3 in Section VI
is satisfied), the value of $F_r^*$, being the optimal time average
cost under the $T$-slot lookahead policy over the $T$-slot interval
starting at time $rT$, satisfies for any integer $r \geq 0$:

$$\lim_{T\to\infty} F_r^* = f^* \quad \text{with probability 1}$$

Footnote: Similar results on optimality of stationary policies can typically be
achieved when the cardinality of the set $\Omega$ is infinite, although steady state
time averages and uniform convergence to a steady state are more awkward to
deal with in this case. The easiest such arguments for (possibly uncountably)
infinite sets $\Omega$ are for $\omega(t)$ processes that are i.i.d. over slots.

That is, regardless of the past history before time $rT$, the
$T$-slot lookahead policy over a very large $T$ approaches the
optimal $f^*$. The reason the "mild" additional conditions, such
as the Slater condition, are needed is that $F_r^*$ requires all
constraints to be exactly satisfied by the end of the $T$ slots,
whereas the infinite horizon problem does not require this.
Because of the uniform error decay, we have:

$$\left|\mathbb{E}\{F_r^*\} - f^*\right| \leq \delta(T)$$

where $\delta(T)$ is a function such that $\delta(T) \to 0$ as $T \to \infty$.
Therefore we can write:

$$F_r^* = f^* + \delta_r$$

where $\delta_r$ is a random variable that satisfies $|\mathbb{E}\{\delta_r\}| \leq \delta(T)$
for all $r$.

Using the definition of $\delta_r$, it follows from (45) that time
average cost satisfies for any integer $T > 0$:

$$\lim_{R\to\infty}\left[\overline{y}_0(RT) + f(\overline{x}(RT))\right] \leq f^* + \frac{B + C}{V} + \frac{D(T - 1)}{V} + \lim_{R\to\infty}\frac{1}{R}\sum_{r=0}^{R-1}\delta_r$$

Under mild conditions, such as when $\omega(t)$ evolves according
to a finite state ergodic Markov chain, law of large numbers
averaging principles imply that the last term, being a time
average of the $\delta_r$ values, is bounded in absolute value with
probability 1 by $\delta(T)$, a term that is negligibly small for
large values of $T$. We can choose a large $T$ provided that
we also compensate with a large $V$ to make the $D(T - 1)/V$
term negligible. This demonstrates that, with probability 1, the
algorithm implemented over an infinite time horizon yields
cost that can be pushed arbitrarily close to the optimal value
$f^*$ if $V$ is suitably large.

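Appendix D - Proof of Theorem 2

Proof: (Theorem 2 part (a)) Define $\theta$ as the positive real number
that solves the following problem (it can be shown that the
solution is given by (77)):

$$\begin{aligned}
\text{Maximize:} \quad & \theta \\
\text{Subject to:} \quad & \theta \leq \delta - \epsilon_Q \\
& \theta \leq \delta - \epsilon_Z - (\epsilon_H + \theta)\sum_{m=1}^{M}\beta_{l,m} \quad \forall l \in \{1,\ldots,L\} \\
& 0 < \theta \leq \delta - \epsilon_H
\end{aligned} \tag{83}$$

Following the proof of Theorem 1 and replacing $C$ with $C(t)$
we have (compare with (79)):

$$\begin{aligned}
\Delta_1(t) &+ V\hat{y}_0(\alpha(t), \omega(t)) + Vf(\gamma(t)) \leq B + C + V\hat{y}_0(\alpha^*(t), \omega(t)) + Vf(\gamma^*(t)) + V\epsilon_V \\
&+ \sum_{l=1}^{L} Z_l(t)\left[\epsilon_Z + \hat{y}_l(\alpha^*(t), \omega(t)) + g_l(\gamma^*(t))\right] \\
&+ \sum_{k=1}^{K} Q_k(t)\left[\epsilon_Q + \hat{a}_k(\alpha^*(t), \omega(t)) - \hat{b}_k(\alpha^*(t), \omega(t))\right] \\
&+ \sum_{m=1}^{M} |H_m(t)|\epsilon_H + \sum_{m=1}^{M} H_m(t)\left[\gamma_m^*(t) - \hat{x}_m(\alpha^*(t), \omega(t))\right]
\end{aligned} \tag{84}$$

where $\alpha^*(t)$ and $\gamma^*(t)$ represent any alternative decisions that
could be made on slot $t$ that satisfy (21)-(23).

Now choose $\alpha^*(t) = \alpha'_{\omega(t)}$, where $\alpha'_{\omega(t)} \in \mathcal{A}_{\omega(t)}$ is the
decision known to exist by Assumption A3. Further, choose
$\gamma^*(t) = \gamma'(t)$, where $\gamma'(t) = (\gamma'_m(t))_{m=1}^{M}$ is defined such
that for all $m \in \{1,\ldots,M\}$:

$$\gamma'_m(t) = \begin{cases} \hat{x}_m(\alpha'_{\omega(t)}, \omega(t)) - (\epsilon_H + \theta) & \text{if } H_m(t) \geq 0 \\ \hat{x}_m(\alpha'_{\omega(t)}, \omega(t)) + (\epsilon_H + \theta) & \text{if } H_m(t) < 0 \end{cases} \tag{85}$$

This is feasible because of the last inequality in Assumption
A3 together with the fact that:

$$-\delta \leq -\epsilon_H - \theta < \epsilon_H + \theta \leq \delta$$

With these choices, (84) becomes:

$$\begin{aligned}
\Delta_1(t) \leq{} & B + C + V(y_0^{\max} - y_0^{\min}) + V(f^{\max} - f^{\min}) + V\epsilon_V \\
&- \sum_{l=1}^{L} Z_l(t)\left[\delta - \epsilon_Z - (\epsilon_H + \theta)\sum_{m=1}^{M}\beta_{l,m}\right] \\
&- \sum_{k=1}^{K} Q_k(t)[\delta - \epsilon_Q] - \sum_{m=1}^{M} |H_m(t)|\theta
\end{aligned} \tag{86}$$

where we have used the fact that, from (13):

$$g_l(\gamma^*(t)) \leq g_l(\hat{x}(\alpha'_{\omega(t)}, \omega(t))) + (\epsilon_H + \theta)\sum_{m=1}^{M}\beta_{l,m}$$

Now define $P$ as:

$$P \triangleq B + C + V(y_0^{\max} - y_0^{\min}) + V(f^{\max} - f^{\min}) + V\epsilon_V$$

Because the value $\theta$ is a bound on all the terms multiplying
queue values in (86), we have:

$$\Delta_1(t) \leq P - \theta\sum_{l=1}^{L} Z_l(t) - \theta\sum_{k=1}^{K} Q_k(t) - \theta\sum_{m=1}^{M} |H_m(t)| \tag{87}$$

It follows that the drift is non-positive whenever the sum of the
absolute value of queue size is greater than or equal to $P/\theta$.
It is not difficult to show that the largest possible value of
$L(\Theta(t))$ under the constraint that the sum of absolute queue
values is less than or equal to $P/\theta$ is $(1/2)(P/\theta)^2$. Hence,
if the Lyapunov function is larger than this value, it cannot
increase on the next slot. However, if the absolute sum on slot
$t$ is less than or equal to $P/\theta$ we have:

$$\begin{aligned}
L(\Theta(t + 1)) &\leq \frac{1}{2}\sum_{l=1}^{L}\left(Z_l(t) + z_l^{diff}(t)\right)^2 + \frac{1}{2}\sum_{k=1}^{K}\left(Q_k(t) + q_k^{diff}(t)\right)^2 + \frac{1}{2}\sum_{m=1}^{M}\left(|H_m(t)| + h_m^{diff}(t)\right)^2 \\
&\leq L(\Theta(t)) + D + z_{\max}\left[\sum_{l=1}^{L} Z_l(t) + \sum_{k=1}^{K} Q_k(t) + \sum_{m=1}^{M} |H_m(t)|\right] \\
&\leq (1/2)(P/\theta)^2 + D + z_{\max}P/\theta
\end{aligned} \tag{88}$$

where $z_l^{diff}(t)$, $q_k^{diff}(t)$, $h_m^{diff}(t)$ represent the absolute value
of the change in $Z_l(t)$, $Q_k(t)$, $H_m(t)$, respectively, over one
slot, having maximum absolute value given by $z_l^{diff}$, $q_k^{diff}$,
and $h_m^{diff}$, $D$ is defined in (41), and $z_{\max}$ is the maximum
over all $z_l^{diff}$, $q_k^{diff}$, and $h_m^{diff}$ constants.
It follows that for all $t$ we have:

$$L(\Theta(t)) \leq (1/2)(P/\theta)^2 + D + z_{\max}P/\theta$$

Footnote: More precisely, the bound on $L(\Theta(t))$ is clearly true for $t = 0$.
Supposing it is true for slot $t$, we show it is true for slot $t + 1$: If the absolute
sum is greater than or equal to $P/\theta$ on slot $t$, then the Lyapunov value cannot
increase on the next slot and so the bound also holds for slot $t + 1$. Else, if
the absolute sum is less than $P/\theta$ on slot $t$, then the bound again holds for
slot $t + 1$ by the calculation (88).

Therefore, all queues are bounded by:

$$Q_k(t),\ Z_l(t),\ |H_m(t)| \leq \sqrt{(P/\theta)^2 + 2D + 2z_{\max}P/\theta}$$

This bound is given by:

$$\frac{V}{\theta}\sqrt{P^2/V^2 + 2D\theta^2/V^2 + 2z_{\max}\theta P/V^2} = \frac{V\sqrt{D_1 + D_2 + D_3}}{\theta}$$

where

$$D_1 \triangleq \left[\frac{B + C}{V} + (y_0^{\max} - y_0^{\min}) + (f^{\max} - f^{\min}) + \epsilon_V\right]^2$$

$$D_2 \triangleq 2D\theta^2/V^2$$

$$D_3 \triangleq \frac{2z_{\max}\theta}{V}\left[\frac{B + C}{V} + (y_0^{\max} - y_0^{\min}) + (f^{\max} - f^{\min}) + \epsilon_V\right]$$

□

Proof: (Theorem 2 part (b)) The proof follows immediately by
applying the queue bounds of part (a) to the constraint bounds
(26), (27), (28) of Lemma 1 using initial queue values of 0.

□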
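
The two forms of the queue bound in part (a) can be compared numerically. The sketch below evaluates $\sqrt{(P/\theta)^2 + 2D + 2z_{\max}P/\theta}$ directly and again in the scaled form $V\sqrt{D_1 + D_2 + D_3}/\theta$, taking $D_1 = (P/V)^2$, $D_2 = 2D\theta^2/V^2$, and $D_3 = 2z_{\max}\theta P/V^2$. The function names and the sample values are hypothetical.

```python
import math

def queue_bound(P: float, D: float, z_max: float, theta: float) -> float:
    """Direct form: sqrt((P/theta)^2 + 2D + 2 z_max P / theta)."""
    return math.sqrt((P / theta) ** 2 + 2.0 * D + 2.0 * z_max * P / theta)

def queue_bound_scaled(P, D, z_max, theta, V):
    """Scaled form: V * sqrt(D1 + D2 + D3) / theta."""
    D1 = (P / V) ** 2
    D2 = 2.0 * D * theta ** 2 / V ** 2
    D3 = 2.0 * z_max * theta * P / V ** 2
    return V * math.sqrt(D1 + D2 + D3) / theta
```

Since $P$ grows linearly in $V$, the dominant term of either form is $P/\theta$, which shows the linear growth of the queue bound in $V$ and its inverse dependence on the slackness parameter $\theta$.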

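Proof: (Theorem 2 part (c)) Fix integers $T > 0$ and $r \geq 0$. Similar
to the proof of Lemma 5, we have by replacing $C$ with $C(t)$
(compare with the bound in Lemma 5):

$$\begin{aligned}
\Delta_T(rT) &+ \sum_{\tau=rT}^{rT+T-1}\left[V\hat{y}_0(\alpha(\tau), \omega(\tau)) + Vf(\gamma(\tau))\right] \leq BT + CT + \tilde{D}T(T - 1) \\
&+ \sum_{\tau=rT}^{rT+T-1}\left[V\epsilon_V + V\hat{y}_0(\alpha^*(\tau), \omega(\tau)) + Vf(\gamma^*(\tau))\right] \\
&+ \sum_{l=1}^{L} Z_l(rT)\sum_{\tau=rT}^{rT+T-1}\left[\epsilon_Z + \hat{y}_l(\alpha^*(\tau), \omega(\tau)) + g_l(\gamma^*(\tau))\right] \\
&+ \sum_{k=1}^{K} Q_k(rT)\sum_{\tau=rT}^{rT+T-1}\epsilon_Q + \sum_{k=1}^{K} Q_k(rT)\sum_{\tau=rT}^{rT+T-1}\left[\hat{a}_k(\alpha^*(\tau), \omega(\tau)) - \hat{b}_k(\alpha^*(\tau), \omega(\tau))\right] \\
&+ \sum_{m=1}^{M} |H_m(rT)|\sum_{\tau=rT}^{rT+T-1}\epsilon_H + \sum_{m=1}^{M} H_m(rT)\sum_{\tau=rT}^{rT+T-1}\left[\gamma_m^*(\tau) - \hat{x}_m(\alpha^*(\tau), \omega(\tau))\right]
\end{aligned}$$

for any alternative decisions $\alpha^*(\tau)$, $\gamma^*(\tau)$ over
$\tau \in \{rT, \ldots, rT + T - 1\}$ that satisfy (21)-(23). The constant $B$
is defined according to (34) and $\tilde{D}$ is defined by:

$$\tilde{D} \triangleq D + \frac{1}{2}\sum_{l=1}^{L} z_l^{diff}\epsilon_Z + \frac{1}{2}\sum_{k=1}^{K} q_k^{diff}\epsilon_Q + \frac{1}{2}\sum_{m=1}^{M} h_m^{diff}\epsilon_H \tag{89}$$

where $D$ is defined in (41). Note that the above bound
holds deterministically for all possible alternative (possibly
randomized) policies. Hence, the bound also deterministically
holds when the right hand side is replaced by the expectation
over any particular randomized policy.

Footnote: Formally, this uses the fact that if $b \leq \psi(\alpha_1, \ldots, \alpha_M)$ for all vectors
$(\alpha_1, \ldots, \alpha_M) \in \mathcal{A}$ for some function $\psi(\cdot)$, some set $\mathcal{A}$, and some constant
$b$, then $b \leq \psi(A_1, \ldots, A_M)$ for any random vector $(A_1, \ldots, A_M)$ that
takes values in $\mathcal{A}$, and hence $b \leq \mathbb{E}\{\psi(A_1, \ldots, A_M)\}$.

Consider now the following randomized decisions for $\alpha^*(\tau)$
and $\gamma^*(\tau)$: With probability $\rho$ (to be defined later), for all
slots $\tau \in \{rT, \ldots, rT + T - 1\}$, choose $\alpha^*(\tau) = \alpha'_{\omega(\tau)}$ and
$\gamma^*(\tau) = \gamma'(\tau)$, where $\alpha'_{\omega(\tau)}$ and $\gamma'(\tau)$ are the policies from
the proof of part (a) associated with slot $\tau$. Specifically, $\alpha'_{\omega(\tau)}$
satisfies Assumption A3, and $\gamma'(\tau)$ is given by (85). Else (with
probability $1 - \rho$), for all slots $\tau \in \{rT, \ldots, rT + T - 1\}$
choose $\alpha^*(\tau) = \alpha''_{\omega(\tau)}$ and $\gamma^*(\tau) = \gamma''$, where the decisions
$\alpha''_{\omega(\tau)}$, $\gamma''$ solve (15) and yield cost $F_r^*$. Note from our
construction here that either all slots of the frame use the first
policy (which happens with probability $\rho$), or all slots of the
frame use the second. Considering the expectation of the right
hand side under this randomized policy, we have:

Footnote: For simplicity, we assume here that the optimum of problem (15) is
achievable by a single policy, else just take a policy that comes within $\epsilon$ of
$F_r^*$ and let $\epsilon \to 0$.

$$\begin{aligned}
\Delta_T(rT) &+ \sum_{\tau=rT}^{rT+T-1}\left[V\hat{y}_0(\alpha(\tau), \omega(\tau)) + Vf(\gamma(\tau))\right] \leq BT + CT + \tilde{D}T(T - 1) + VT\epsilon_V \\
&+ (1 - \rho)VTF_r^* + \rho TV[y_0^{\max} + f^{\max}] \\
&+ \sum_{l=1}^{L} Z_l(rT)\sum_{\tau=rT}^{rT+T-1}\left[\epsilon_Z - \rho\left(\delta - (\epsilon_H + \theta)\sum_{m=1}^{M}\beta_{l,m}\right)\right] \\
&+ \sum_{k=1}^{K} Q_k(rT)\sum_{\tau=rT}^{rT+T-1}[\epsilon_Q - \rho\delta] + \sum_{m=1}^{M} |H_m(rT)|\sum_{\tau=rT}^{rT+T-1}[\epsilon_H - \rho(\epsilon_H + \theta)]
\end{aligned}$$

Now choose the probability $\rho$ to make all of the above
queueing terms non-positive, as follows:

$$\rho = \max\left[\frac{\epsilon_Q}{\delta},\ \frac{\epsilon_H}{\epsilon_H + \theta},\ \frac{\epsilon_Z}{\delta - (\epsilon_H + \theta)\beta_{sum}}\right]$$

where

$$\beta_{sum} \triangleq \max_{l \in \{1,\ldots,L\}} \sum_{m=1}^{M} \beta_{l,m}$$

This is a valid probability (so that $0 \leq \rho \leq 1$) by definition
of $\theta$ (being the solution of (83)). Therefore:

$$\Delta_T(rT) + \sum_{\tau=rT}^{rT+T-1}\left[V\hat{y}_0(\alpha(\tau), \omega(\tau)) + Vf(\gamma(\tau))\right] \leq BT + CT + \tilde{D}T(T - 1) + VT\epsilon_V + (1 - \rho)VTF_r^* + \rho TV[y_0^{\max} + f^{\max}]$$

Summing the above over $r \in \{0, 1, \ldots, R - 1\}$, using
non-negativity of the Lyapunov function, the fact that all queues are initially
empty, convexity of $f(\cdot)$, and dividing by $RTV$ yields:

$$\overline{y}_0 + f(\overline{\gamma}) \leq \frac{1 - \rho}{R}\sum_{r=0}^{R-1} F_r^* + \epsilon_V + \rho(y_0^{\max} + f^{\max}) + \frac{B + C + \tilde{D}(T - 1)}{V}$$

Finally, we have:

$$\begin{aligned}
f(\overline{x}) &\leq f(\overline{\gamma}) + \sum_{m=1}^{M} \nu_m|\overline{x}_m - \overline{\gamma}_m| \\
&\leq f(\overline{\gamma}) + \sum_{m=1}^{M} \nu_m\frac{|H_m(RT)|}{RT} \\
&\leq f(\overline{\gamma}) + \sum_{m=1}^{M} \frac{\nu_m V C_3}{\theta RT}
\end{aligned}$$

This proves Theorem 2.

□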



Appendix E — The B and D Constants 
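Values of $B$ and $D$ that satisfy (34) and (41) are given by:

$$B = \frac{1}{2}\sum_{l=1}^{L}(z_l^{diff})^2 + \frac{1}{2}\sum_{m=1}^{M}(h_m^{diff})^2 + \frac{1}{2}\sum_{k=1}^{K}\left[(a_k^{\max})^2 + (b_k^{\max})^2\right] \tag{90}$$

$$D = \frac{1}{2}\sum_{l=1}^{L}(z_l^{diff})^2 + \frac{1}{2}\sum_{m=1}^{M}(h_m^{diff})^2 + \frac{1}{2}\sum_{k=1}^{K}(q_k^{diff})^2 \tag{91}$$

where constants $z_l^{diff}$, $h_m^{diff}$, $q_k^{diff}$ are defined after equation
(41).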

The Internet Model: For the internet model of Section IV
there are no queues $Q_k(t)$ and so we have $q_k^{diff} = a_k^{\max} = b_k^{\max} = 0$.
We further have $h_m^{diff} = A_m^{\max}$



and 



diff 

= max 



M 



cr 



El A max 



where $1_{l,m}$ is equal to 1 if it is possible for session $m$ to
ever be routed over link $l$, and zero else. Using this value of
$z_l^{diff}$ in (90) and (91), the values of $B$ and $D$ for this internet
context, and in particular for the utility bound (52), are:

$$B = D = \frac{1}{2}\sum_{l=1}^{L}(z_l^{diff})^2 + \frac{1}{2}\sum_{m=1}^{M}(A_m^{\max})^2$$



The Dynamic Queueing Network Model: For the dynamic
queueing network of Section V there are no $Z_l(t)$ queues and
so $z_l^{diff} = 0$. Further, $h_m^{diff} = A_m^{\max}$. Because indices $k$ of
queues $Q_k(t)$ in the general framework correspond to indices
$(n, c)$ for queues $Q_n^{(c)}(t)$ in the dynamic queueing network,
the values of $B$ and $D$ satisfy (34) and (41) whenever the
following holds for all $t$:

$$B \geq \frac{1}{2}\sum_{m=1}^{M}(A_m^{\max})^2 + \frac{1}{2}\sum_{n,c}\left[b_n^{(c)}(t)^2 + a_n^{(c)}(t)^2\right] \tag{92}$$

$$D \geq \frac{1}{2}\sum_{m=1}^{M}(A_m^{\max})^2 + \frac{1}{2}\sum_{n,c}\max[b_n^{(c),\max}, a_n^{(c),\max}]\max[b_n^{(c)}(t), a_n^{(c)}(t)] \tag{93}$$

We use this form, rather than the more explicit form (90), (91),
because this often allows a tighter bound when we incorporate
the structure of the network. Specifically, define constants
$\mu_n^{\max,in}$, $\mu_n^{\max,sum}$, $x_n^{(c),\max}$ as the maximum possible sum
transmission rate into node $n$, sum of transmission rates into
and out of node $n$, and exogenous arrivals to source node $n$ of
commodity $c$, respectively, over one slot. Note that $x_n^{(c),\max}$
is given by the sum of $A_m^{\max}$ over the sessions $m$ that have source
node $n$ and commodity $c$.

Then we have for each $n \in \{1,\ldots,N\}$:

$$\sum_{c=1}^{N}\left[b_n^{(c)}(t)^2 + a_n^{(c)}(t)^2\right] \leq (\mu_n^{\max,sum})^2 + \sum_{c=1}^{N}(x_n^{(c),\max})^2 + 2\mu_n^{\max,in}\max_{c \in \{1,\ldots,N\}} x_n^{(c),\max}$$

Therefore a value of $B$ that satisfies (92) is given by:

$$B = \frac{1}{2}\sum_{n=1}^{N}\left[(\mu_n^{\max,sum})^2 + \sum_{c=1}^{N}(x_n^{(c),\max})^2\right] + \sum_{n=1}^{N}\mu_n^{\max,in}\max_{c \in \{1,\ldots,N\}}[x_n^{(c),\max}] + \frac{1}{2}\sum_{m=1}^{M}(A_m^{\max})^2$$

Finally, define $e_n$ as follows:

$$e_n \triangleq \max_{c \in \{1,\ldots,N\}} \max[b_n^{(c),\max}, a_n^{(c),\max}]$$

Then for each $n \in \{1,\ldots,N\}$ we have:

$$\begin{aligned}
\sum_{c}\max[b_n^{(c),\max}, a_n^{(c),\max}]\max[b_n^{(c)}(t), a_n^{(c)}(t)] &\leq e_n\sum_{c}\max[b_n^{(c)}(t), a_n^{(c)}(t)] \\
&\leq e_n\sum_{c}[b_n^{(c)}(t) + a_n^{(c)}(t)] \\
&\leq e_n\left[\mu_n^{\max,sum} + \sum_{c} x_n^{(c),\max}\right]
\end{aligned}$$

Therefore a value of $D$ that satisfies (93) is:

$$D = \frac{1}{2}\sum_{m=1}^{M}(A_m^{\max})^2 + \frac{1}{2}\sum_{n=1}^{N} e_n\left[\mu_n^{\max,sum} + \sum_{c} x_n^{(c),\max}\right]$$

For example, consider a wireless network where data is
measured in integer units of packets (assumed to have a fixed
length). Suppose that at most one packet can be transmitted
or received per node per slot, and that a packet cannot be
transmitted and received on the same slot at the same node.
Then we have $\mu_n^{\max,in} = \mu_n^{\max,sum} = 1$. Further, suppose
there is at most one source at any given node (so that $M \leq N$),
and no source can admit more than 1 packet per slot. Then
$\sum_c x_n^{(c),\max} = 1$ if node $n$ is a source, and zero else, and
$e_n = 2$ if node $n$ is a source, and 1 else. There are $M$ source
nodes and $N - M$ non-source nodes, and so $B$ and $D$ are:

$$B = D = (N + 4M)/2$$
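
The closing example can be reproduced by summing the per-node terms. The sketch below assumes the setting described above: one packet per node per slot, so the maximum incoming rate and the maximum combined rate are both 1; at most one source per node; and $A_m^{\max} = 1$. It uses the per-node bounds $(\mu_n^{\max,sum})^2 + \sum_c (x_n^{(c),\max})^2 + 2\mu_n^{\max,in}\max_c x_n^{(c),\max}$ for $B$ and $e_n[\mu_n^{\max,sum} + \sum_c x_n^{(c),\max}]$ for $D$. The function name and the representation of sources as a set of node indices are hypothetical.

```python
def b_and_d(N: int, sources: set):
    """Return (B, D) for the one-packet-per-slot wireless example."""
    M = len(sources)          # one session per source node
    mu_in = mu_sum = 1        # max rate into a node, and into plus out of it
    A_max = 1                 # max admitted packets per session per slot
    B = 0.5 * M * A_max ** 2
    D = 0.5 * M * A_max ** 2
    for n in range(1, N + 1):
        x = 1 if n in sources else 0   # exogenous arrival bound at node n
        e_n = mu_in + x                # 2 at a source node, 1 otherwise
        B += 0.5 * (mu_sum ** 2 + x ** 2 + 2 * mu_in * x)
        D += 0.5 * e_n * (mu_sum + x)
    return B, D
```

For instance, `b_and_d(5, {1, 3})` gives `(6.5, 6.5)`, which agrees with $(N + 4M)/2$ for $N = 5$ and $M = 2$.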